Ligation and/or assembly of nucleic acid molecules

ABSTRACT

The present invention relates to ligation and/or assembly of nucleic acid molecules. Particularly, a double-stranded target nucleic acid having overhangs of at least one nucleotide is ligated with another nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide. The invention is suitable for tagging nucleic acid molecules. In specific embodiments, the overhangs can be produced by chemical cleavage of phosphorothioate-modified nucleic acid molecules. The invention further relates to the amplification of the ligated product, and using the resultant amplicon for assembly of multiple nucleic acid fragments.

FIELD OF THE INVENTION

The present invention relates to the field of molecular biology, for example recombinant nucleic acid technology. In particular, the present invention relates to ligating, joining and/or assembly of nucleic acid molecules.

BACKGROUND OF THE INVENTION

Recombinant nucleic acid technology and/or genetic engineering is transforming how humans generate fuels¹, produce bulk chemicals², and treat diseases³. Most recombinant nucleic acid technology and/or genetic engineering work requires construction of a plasmid, a vector for carrying genetic information. Currently, tens of thousands of research laboratories across the globe construct plasmids every day in a highly inefficient way: researchers customize the materials they need, pay commercial companies to synthesize them from scratch, wait for the materials to be delivered, and assemble the materials in their own laboratories. Some foundries have been built to solve this problem by using automation⁴, but they have not addressed the central piece of the problem, that is, the use of customized Biological Parts⁵ (BPs).

More than a decade ago, the BioBricks Foundation was founded in the US to provide standardized BPs to the public. However, it has become less popular⁵ because of the following flaws in it and in the corresponding DNA assembly method: (1) large scars (conserved, useless nucleotides) are left between BPs, which may affect the biological function of BPs^(6,7); (2) only two BPs can be assembled in one round, resulting in long plasmid construction times; (3) certain restriction enzyme recognition sites must be avoided in the sequences of BPs. Recently, a newer standard (BASIC) has been published⁷, allowing multiple-fragment and multi-tier assembly; however, it has solved only some of these problems and has not been widely adopted.

Existing methods such as Golden Gate¹⁵ and BASIC⁷ do not satisfactorily address these problems.

It is therefore desirable to develop new improved recombinant nucleic acid and/or genetic engineering methodologies.

SUMMARY OF THE INVENTION

According to a first aspect, the present invention provides a method for ligating at least two nucleic acid molecules comprising:

-   (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end;
-   (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and
-   (iii) ligating the first nucleic acid molecule to the second nucleic acid molecule at the complementary overhangs to form a single nucleic acid molecule.
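The complementarity requirement in step (ii) can be illustrated with a minimal Python sketch. It assumes both overhangs are written 5′ to 3′ and tests for exact Watson-Crick pairing only, which is stricter than the "substantially complementary" wording of the method; the function names are hypothetical and are not part of the disclosure.

```python
# Watson-Crick pairing table for DNA (uppercase bases only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def overhangs_compatible(first_overhang: str, second_overhang: str) -> bool:
    """True when two overhangs (both written 5'->3') pair perfectly.

    Assumption: exact complementarity; mismatch-tolerant pairing is not modelled.
    """
    return first_overhang.upper() == reverse_complement(second_overhang)


# A 'C' overhang pairs with a 'G' overhang, and 'T' pairs with 'A'.
print(overhangs_compatible("C", "G"))
print(overhangs_compatible("T", "A"))
```

For single-nucleotide overhangs the check reduces to simple base pairing, which is why a fragment carrying 'C' and 'T' overhangs needs partners carrying 'G' and 'A' overhangs.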

According to a second aspect, the present invention provides a method for ligating three nucleic acid molecules comprising:

-   (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other;
-   (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and also providing a third nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the third nucleic acid molecule is substantially complementary to the second overhang of the second end of the first nucleic acid molecule; and wherein the overhang of the second nucleic acid molecule and the overhang of the third nucleic acid molecule have different sequences and/or are not complementary to each other; and
-   (iii) ligating the first overhang at the first end of the first nucleic acid molecule to the overhang of the second nucleic acid molecule and also the second overhang of the second end of the first nucleic acid molecule to the overhang of the third nucleic acid molecule to form a single nucleic acid molecule.

The invention includes a nucleic acid molecule comprising a defined sequence capable of forming a stem-loop structure with an overhang of one nucleotide.

The invention also includes a kit comprising a plurality of nucleic acid molecules, each with a defined sequence capable of forming a stem-loop structure with an overhang of at least one nucleotide. The kit may further comprise one or a plurality of oligonucleotide(s).

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1: Diagram of Universal DNA-Assembly Standard (UDS). Barcoding means adding a specific barcode to each side of a fragment. We have developed technology that can perform barcoding without leaving any scar between barcode and fragment. Barcodes facilitate assembly of fragments. Composite parts are parts that are produced by assembly of simple ones, and can be easily reused in new rounds of DNA assembly according to UDS.

FIG. 2: 1 nucleotide barcoding method. (a) Workflow for creating 1-mer sticky end-containing fragments; *: phosphorothioate bond; —: phosphodiester bond. After chemical treatment, 3′ end overhangs of ‘C’ and ‘T’ are generated at the second end and at the first end, respectively. (b) Workflow for creating 1-mer sticky end-containing entry vector; *: phosphorothioate bond; —: phosphodiester bond. After chemical treatment, 3′ end overhangs of ‘A’ and ‘G’ are generated at the second end and at the first end, respectively.

Comparison of three methods that barcode 1-mer sticky end-containing fragments: (c) Using entry vector: the entry vector included two barcodes (shown as two semicircles) on its sides, and was prepared by PCR. Similar to the workflow in a, two 1-mer sticky ends were introduced to the entry vector, allowing it to be ligated to any fragment that contains compatible sticky ends. After the ligation, the barcoded fragment was amplified by using two oligos targeting the barcode regions (shown as two black arrows) for downstream DNA assembly. (d) Annealed oligos: each barcode with a 1-mer sticky end was generated by annealing two complementary oligos, one of which was shorter than the other by one nucleotide. The two barcodes (shown as semicircles) were ligated to a fragment that contained compatible sticky ends, and the ligation product was amplified by using two oligos (shown as two black arrows). (e) Stem-loop oligos: the two oligos used to create a barcode were made into one by joining them with a loop region, and the two barcodes (shown as semicircles) were ligated to a fragment that contained compatible sticky ends, and the ligation product was amplified by using two oligos (shown as two black arrows). The 3 methods were used to barcode 5 fragments, which can be used to create a CRISPR-Cas9 based knockout plasmid. 1: Antibiotic resistance marker (1.2 kb); 2: Replication origin (0.9 kb); 3: Guide RNA (0.2 kb); 4: Homologous arm 1 (0.5 kb); 5: Homologous arm 2 (0.5 kb). (f) The 5 fragments barcoded by using stem-loop oligos were successfully assembled into a plasmid through long sticky end mediated ligation. How long SEs are generated and how two BFs with long SEs are assembled is elaborated in FIG. 6. 8 randomly picked colonies were confirmed to be correct by colony PCR. Further sequencing and functionality tests were also positive (FIG. 7).

FIG. 3: Optimization of isoprenoid biosynthesis pathway to overproduce valencene in E. coli. (a) Central metabolism and valencene biosynthesis. Glc: glucose; Gap: glyceraldehyde 3-phosphate; Pep: phosphoenolpyruvate; Pyr: pyruvate; AcCoA: acetyl-CoA; Cit: citrate; Mal: malic acid; Oaa: oxaloacetate; Dxp: deoxy-xylulose 5-phosphate; Mec: methylerythritol cyclodiphosphate; Hmbpp: hydroxylmethylbutenyl diphosphate; Ipp: Isopentenyl diphosphate; Dmapp: dimethylallyl diphosphate; Fpp: farnesyl diphosphate; Val: valencene. Dotted arrow indicates multi-step reaction; black arrow indicates single-step reaction. Underlined italic text indicates the enzyme-encoding genes for the related reactions. The order of four genes (dxs, ispA, idi and valC) in an operon is shuffled, generating 24 (4!) variants, which can be easily constructed by using 5 fragments (the 4 genes and 1 backbone plasmid) and 5 barcodes (RBS1, RBS2stop, RBS3stop, RBS4stop and 3UTR). (b) Screening of 24 valencene-producing E. coli (MG1655_ΔrecA_ΔendA_DE3, IPS1-IPS24) strains; the inducer IPTG was added at varied concentrations (0, 0.005 and 0.1 mM) to induce gene expression for 72 hours.
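The 24 operon variants mentioned in this caption can be enumerated with a short Python sketch. The gene and barcode names come from the caption; the layout used here (one barcode before each gene, with 3UTR closing the operon) is an assumption made for illustration only, as is the function name.

```python
from itertools import permutations

GENES = ("dxs", "ispA", "idi", "valC")
BARCODES = ("RBS1", "RBS2stop", "RBS3stop", "RBS4stop", "3UTR")


def operon_variants(genes=GENES, barcodes=BARCODES):
    """Interleave every gene ordering with fixed, position-specific barcodes.

    Assumed layout: barcode, gene, barcode, gene, ..., closing barcode.
    """
    variants = []
    for order in permutations(genes):
        layout = []
        for barcode, gene in zip(barcodes, order):
            layout += [barcode, gene]
        layout.append(barcodes[-1])  # closing barcode after the last gene
        variants.append(layout)
    return variants


variants = operon_variants()
print(len(variants))
print(" - ".join(variants[0]))
```

Because the barcodes stay at fixed positions while only the genes move, the same five barcodes serve every one of the orderings in this sketch.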

FIG. 4: Optimizing expression level of two genes (mutants of tyrA and aroG that were created to resist feedback inhibition imposed by tyrosine) for overproducing tyrosine in E. coli. (a) Central metabolism and tyrosine biosynthesis. Glc: glucose; G6p: Glucose 6-phosphate; Gap: glyceraldehyde 3-phosphate; Pep: phosphoenolpyruvate; Pyr: pyruvate; AcCoA: acetyl-CoA; Cit: citrate; Mal: malic acid; Oaa: oxaloacetate; Ppp: Pentose phosphate pathway; E4p: Erythrose 4-phosphate; Dahp: 3-Deoxy-D-arabinoheptulosonate 7-phosphate; Pp: prephenate; Hpp: 4-hydroxyphenylpyruvate; Tyr: tyrosine. Dotted arrow indicates multi-step reaction; black arrow indicates single-step reaction. Underlined italic text indicates the enzyme-encoding genes for the related reactions. A small UDS library was built to express them at different levels, combining 2 promoters (T7 promoter with LacI expression cassette, and Lac promoter), 2 gene orders (of the tyrA and aroG mutants) and 4 replication origins (p5: low copy number; p15: medium copy number; pMB1: medium copy number; pMB1 mutant from pUC19: high copy number), together with spectinomycin antibiotic resistance (and a LacI expression cassette for the 7-BF assembly of plasmids with the Lac promoter) and 7 barcodes (RBS1, RBS4stop, 3UTR, N21, N24, N22 and N23), leading to 16 (2×2×4) variants. AR: Antibiotic resistance; RO: Replicative Origin; Pro: promoter; Ter: terminator. (b) Screening of 16 tyrosine-producing E. coli (MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3) strains with induction by 0.1 mM IPTG for 84 hours.

FIG. 5: Detailed illustration of adding barcodes to 1-mer SE based fragments without leaving a scar in the final construct.

FIG. 6: Detailed illustration of fragment barcoding and creation of long sticky ends (15-20 bp). *: phosphorothioate bond; —: phosphodiester bond. DNA part 1 and part 2, which carry reverse-complementary long SEs, can be annealed at 45° C., and the nicks are then repaired by Taq ligase.

FIG. 7: Sequencing and testing functionality of the plasmid constructed by assembling 5 fragments. The procedure was described in FIG. 2. The plasmid can knock out the gene nupG in the E. coli genome with the aid of CRISPR-Cas9. Plasmids from 2 randomly selected colonies out of 8 colonies were sequenced, and all junction regions (barcodes) in them were correct. The function of the plasmid was also verified by testing its efficiency of gene deletion, and colony PCR results indicated that nupG was deleted in all 6 colonies based on the length of the bands and comparison with wild-type (C10: wild-type E. coli, MG1655_DE3; 1-6: colonies picked from the tested plate).

FIG. 8: (a) Details of the integration module, which can be used for a gene deletion test. (b) The pre-constructed plasmid from the above integration module can be amplified by PCR, and can then be barcoded and assembled with an insert to construct a plasmid for a gene insertion test. DHS: downstream homologous sequence; UHS: upstream homologous sequence; gRNA: guide RNA; PB: promoter barcode (promoter as a barcode); P: promoter; N21-24: non-functional barcode.

FIG. 9: Existing method leaves scars during barcoding because a restriction enzyme was used. In this method, each barcode is created by annealing two oligos, and must be at least 33 mer long, otherwise the two oligos cannot form a stable DNA duplex.

FIG. 10: GT assembly standard (GTas) for constructing plasmids from standard parts (fragments and barcodes). (a) The plasmid construction workflow and examples of fragments and barcodes; (b) Using a minimal conserved sequence (one nt long, G and T) in each fragment eliminated scars in most DNA assembly practices. Two scarless connections are shown here. (c) Mechanism of adding two partial barcodes to one fragment and of using two complementary partial barcodes to assemble two fragments. * indicates phosphorothioate (PS) bond; * in a circle indicates PS group; — indicates phosphodiester bond; P in a circle: phosphate group; . . . indicates omitted, unspecified nucleotides. Items in grey shade are new items introduced in the step. (d) Statistics of 370 plasmids (P1 to P370) we have constructed by using GTas. Accuracy was based on sequencing; as an example, 80% accuracy indicates that 8 out of 10 colonies contained the correct plasmid based on Sanger sequencing of the plasmids. Each circle represents one plasmid construction. The face color of each circle indicates the length of the plasmid (the color bar is provided, unit: kb). Each thick orange horizontal bar represents the median of accuracies of a group (plasmids constructed from the same number of fragments were included in one group). Each orange box indicates the first and third quartiles of accuracies of a group. Each thick blue horizontal bar and the related error bar indicate the mean and standard error of accuracies of a group, respectively. For illustration purposes, circles with the same accuracy and fragment number are scattered around their exact values of accuracy and fragment number on the plot.
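The per-group statistics described for panel (d) can be reproduced in outline with the Python standard library. This is a minimal sketch: the accuracy values in the usage lines are hypothetical placeholders, not data from the 370 plasmids, and the quartile convention (the default of `statistics.quantiles`) is an assumption, since the caption does not state which one was used.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def accuracy(correct_colonies: int, sequenced_colonies: int) -> float:
    """Fraction of sequenced colonies carrying the correct plasmid."""
    return correct_colonies / sequenced_colonies


def summarize_group(accuracies):
    """Median, quartiles, mean and standard error for one fragment-number group."""
    q1, _, q3 = quantiles(accuracies, n=4)  # default 'exclusive' method (assumption)
    return {
        "median": median(accuracies),
        "q1": q1,
        "q3": q3,
        "mean": mean(accuracies),
        "sem": stdev(accuracies) / sqrt(len(accuracies)),
    }


# Hypothetical group of four plasmid constructions.
group = [accuracy(8, 10), accuracy(10, 10), accuracy(10, 10), accuracy(6, 10)]
print(summarize_group(group))
```

The standard error here is the sample standard deviation divided by the square root of the group size, so a group needs at least two constructions to be summarized.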

FIG. 11: Development of a scarless barcoding method. (a-b) Workflow of barcoding a fragment by using conventional (a) and novel (b) oligos. After the ligation, barcoded fragments were amplified by using two Assembling oligos (Aoligos, shown as long black arrow with *) that bind barcodes. (c-d) Analysis of PCR products from a-b by using gel electrophoresis. White arrows indicate the desired bands, and one sample was loaded into two lanes. The lane symbols are explained below. (e) The three plasmids used to compare the two barcoding methods. Each of the three plasmids can be used to delete a gene in E. coli (nupG, manZ or glk) by using CRISPR/Cas9 technology, and was constructed from five fragments: antibiotic resistance marker (AR), replication origin (RO), guide RNA (gRNA), upstream homologous sequence (UHS), and downstream homologous sequence (DHS). Five barcodes (indicated by green text) were used in each plasmid construction. PCR test results of the nupG plasmid are shown in c-d (those of the other two plasmids are shown in FIG. 18a). (f) Colony forming units (CFU/μg DNA) of three plasmids constructed using fragments that were barcoded by conventional and novel oligo design. (g) Representative sequencing results. Two plasmids in each plasmid construction were sequenced. All the plasmids constructed by using novel oligo design were sequenced to be correct (6/6). Half of the plasmids constructed by using conventional oligo design were found to have assembly errors (3/6). A pair of sequencing results from the manZ plasmid construction are shown here. The other sequencing data can be found in FIG. 18b. The red arrow and text indicate the insertion to the sequence.

FIG. 12: Principles of designing and using the oligos. (a) Structure of barcodes. Any barcode is composed of L, SE and R. L and R can be empty. SE should be 15-20 nt depending on its melting temperature and should not contain any sequence that may result in self-ligation. (b) Barcoding oligos (Boligos). There are four Boligos to encode all the variants of a barcode. The six nucleotides (NNNNNN in grey) can be any sequence as long as the illustrated stem-loop structure can be formed. The arrow in Boligo indicates the strand on which the barcode half is encoded. The length of a Boligo is no longer than 90 nt. (c) Each Boligo has a corresponding Aoligo (Assembling oligo). * indicates a PS bond; — indicates a phosphodiester bond. There is one more PS bond at the center of each SE (not shown for simplicity), which is used to improve DNA assembly efficiency. (d) Instructions for connecting two fragments (f1 and f2) through one barcode (B) in eight ways. Prefix “<” indicates that a part is flipped. The blue arrows below sequences indicate fragment orientation, while green arrows indicate barcode orientation. The pair of Boligos used in each way is shown at the right side of each sequence. The nucleotides (G and A) used in fragment-barcode ligation are shown in bold font. (e) Demonstration of the eight connection options in a simple example. AR: antibiotic resistance marker; Ceul: a barcode; RO: replication origin; P: promoter; gfp: a gene encoding green fluorescence protein; t: terminator. Each row indicates a plasmid (A1-A8). Arrow indicates orientation of fragment/barcode. Only relevant parts of each plasmid are illustrated. Each plasmid's performance (i.e. specific fluorescence signal) is shown in the chart. Empty circles indicate values of replicates. Each bar indicates the mean of triplicates at each condition. Error bars indicate standard error (n=3). One empty plasmid (A0) without gfp was used as a negative control. (f) Fragment oligos (Foligos) are used to amplify a fragment by PCR. The position of each PS bond is shown. A Foligo should have a sufficient melting temperature to allow good binding during PCR, and the principle is the same as in regular PCR oligo design.
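The length rules stated in this caption (SE of 15-20 nt, Boligo no longer than 90 nt) lend themselves to a simple design check. The Python sketch below is illustrative only: the function names are hypothetical, and treating a perfectly palindromic SE as the sole self-ligation risk is a simplifying assumption that is not taken from the disclosure. Melting temperature is not modelled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def check_sticky_end(se: str) -> list:
    """Return a list of design problems found in a sticky-end (SE) sequence."""
    problems = []
    if not 15 <= len(se) <= 20:
        problems.append("SE length %d nt is outside 15-20 nt" % len(se))
    # Assumption: a self-complementary SE could anneal to a copy of itself.
    if se.upper() == reverse_complement(se):
        problems.append("SE is palindromic and may self-ligate")
    return problems


def check_boligo(boligo: str) -> list:
    """Return a list of design problems found in a Boligo sequence."""
    if len(boligo) > 90:
        return ["Boligo length %d nt exceeds 90 nt" % len(boligo)]
    return []


print(check_sticky_end("GATTACAGATTACAGA"))  # 16 nt, not palindromic
print(check_sticky_end("ACGT"))              # too short and palindromic
```

Returning a list of messages instead of a single pass/fail flag lets one candidate sequence report every rule it breaks at once.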

FIG. 13: Construct and edit plasmids by using standard parts under GTas. (a) Structure of the 16 plasmids for improving tyrosine production in E. coli. Blue thick horizontal bars represent fragments; RO: replication origin (pSC101, pAC, pMB1, or pUC); P: promoter (PT7 or PLac); G1, G2: two genes in an operon (tyrA-aroG or aroG-tyrA); t: terminator; AR: antibiotic resistance marker. Green text indicates barcodes; N21, N22, N23: non-functional connectors; RBS1: ribosome binding site; SRBS4: stop codon and ribosome binding site; 3UTR: stop codon and untranslated region. (b) The biosynthetic pathway of tyrosine and coumaric acid from glucose. Pep: Phosphoenolpyruvate; E4p: Erythrose 4-phosphate; aroG: a gene encoding mutated E. coli 3-deoxy-7-phosphoheptulonate synthase; tyrA: a gene encoding mutated E. coli fused chorismate mutase/prephenate dehydrogenase; tal: a gene encoding tyrosine ammonia-lyase from Saccharothrix espanaensis. (c) Composition of the 16 plasmids and their corresponding tyrosine titer. TPP1-16: Tyrosine-Producing Plasmid 1-16, created by combination of RO, P and G1-G2 as indicated. Empty circles indicate values of replicates. Each bar indicates the mean of duplicates at each condition. (d-f) Editing a plasmid by replacing, deleting, or adding a component by using standard oligos. Orange text indicates the component changed or added. Thin black arrows indicate primers used in PCR. (h-i) Characterization of strains carrying the three plasmids obtained from d-f in terms of tyrosine (h) and coumaric acid (i) production. Empty circles indicate values of replicates. Each bar indicates the mean of at least three replicates at each condition. Error bars indicate standard error (n>=3).

FIG. 14: Construct plasmid library for combinatorial optimization of microbial strains by using standard parts. (a) Construction of two combinatorial plasmid libraries to enhance coumaric acid production of E. coli. Six genes involved in shikimate production (ppsA, tktA and aroE) and coumaric acid production (aroG, tyrA and tal) were divided into two modules. Each module has six variants resulting from shuffling three genes. To prepare one library, six Module 1 (M1) plasmids and six Module 2 (M2) plasmids were made. Both M1 and M2 plasmids used the same promoter (pthrC3) and terminator (T7 terminator). Then, six M1 fragments and six M2 fragments were amplified from M1 and M2 plasmids by using two sets of Aoligos as indicated by black arrows (PCR results are presented in FIG. 23a). The amplified M1 and M2 fragments were mixed equimolarly, and assembled with a plasmid backbone (pSC101 or pAC). In library 1, pSC101 was used. In library 2, pAC was used. More details of the library construction and the quality control data are described in FIG. 23b. (b) Extended biosynthetic pathway of coumaric acid from glucose. ppsA: a gene encoding E. coli phosphoenolpyruvate synthetase; tktA: a gene encoding E. coli transketolase; aroE: a gene encoding E. coli shikimate dehydrogenase. (c) Screening the two plasmid libraries. The mixture of plasmids was used to transform an E. coli strain (MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3). For each library, seventy-two colonies (two times library size) were screened with coumaric acid titer as the evaluation metric. Note that only a fraction of colonies from library 2 could be successfully cultured in liquid medium (57/72). Each circle indicates the coumaric acid titer of a strain. (d) Top 2 strains from c were characterized by sequencing the corresponding plasmids (data not shown). The arrangement of the genetic parts of the corresponding plasmids is presented.
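The library size implied by this caption, and the reason for screening twice as many colonies as there are library members, can be illustrated with a short Python sketch. The coverage estimate assumes that every colony is an independent, uniformly random draw from the library; that sampling model is an assumption introduced here and is not a claim made in the caption.

```python
from itertools import permutations, product

M1_GENES = ("ppsA", "tktA", "aroE")
M2_GENES = ("aroG", "tyrA", "tal")


def library_members(m1_genes=M1_GENES, m2_genes=M2_GENES):
    """Pair every gene ordering of Module 1 with every ordering of Module 2."""
    m1_variants = list(permutations(m1_genes))  # 3! orderings
    m2_variants = list(permutations(m2_genes))  # 3! orderings
    return list(product(m1_variants, m2_variants))


def expected_coverage(library_size: int, colonies_screened: int) -> float:
    """Expected fraction of members seen at least once under uniform sampling."""
    return 1.0 - (1.0 - 1.0 / library_size) ** colonies_screened


members = library_members()
print(len(members))
print(expected_coverage(len(members), 72))
```

Under this idealized model, screening a number of colonies equal to twice the library size is expected to sample most, but not all, of the members at least once.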

FIG. 15: The vision to transform plasmid construction practices in the global biotechnology community through using GTas parts. (a) A proposed network for users to share and/or acquire GTas parts. A user may receive or contribute GTas parts as licensed materials (through Materials Transfer Agreement [MTA]) or purchase GTas parts from commercial sources. The licensed materials may be transferred through a service provider (such as Addgene) to speed up execution of the MTA, and can be licensed to another type of service provider that sells commercial products (commercial sources). The parts related to creating fragments are in the blue box, and those related to creating barcodes are in the green box. A legend to explain each symbol is provided. gDNA: genomic DNA; cDNA: complementary DNA; sDNA: synthetic DNA. (b) Usage frequency of Boligos and Aoligos in construction of the 370 plasmids. Each circle indicates the usage frequency related to a barcode (the raw data on the number of times each barcode has been reused in construction of the 370 plasmids is not shown).

FIG. 16: GT standard enables placing the same gene into a standalone expression cassette or into a fusion protein without leaving a scar. (a) tal (a gene encoding tyrosine ammonia-lyase) was expressed by a standalone expression cassette containing a thrC3 promoter (an auto-inducible promoter³⁵) and T7 terminator (T7T). Met: Start codon encoding a methionine; STOP: Stop codon. Sequencing results confirmed that the desired sequences have been successfully incorporated into the plasmid at the junction regions. Alignments between sequenced data and template are shown in the boxes on the right. The black arrows indicate the start (ATG) and stop (TGA) codons, which were correctly incorporated into the open reading frame of the gene. RBS1: ribosome binding site; Shistag: stop codon and sequence encoding a six histidine-tag (histag). N21, N22, N23: non-functional barcodes; AR: Antibiotic resistance marker; RO: Replication origin. (b) tal was fused with a Signal Peptide⁴² (SP) via a linker which consists of Glycine (G) and Serine (S), forming a fusion protein with a histag attached to its C-terminus. Sequencing results confirmed the presence of two pre-designed sequences that encode the connecting amino acid residues within the constructed plasmid (indicated by the black arrows).

FIG. 17: To prepare conventional oligos (FIG. 11a), two Boligos (RG-Boligo and LA-Boligo) for each barcode half were prepared by annealing two complementary single-stranded oligos (RG-f/RG-r and LG-f/LG-r), one of which was shorter than the other by one nucleotide. All oligos used to create conventional oligos are listed in Table 16.

FIG. 18: (a) To construct the other two plasmids, manZ-pTarget and glk-pTarget, as described in FIG. 11e, the barcoded fragments were amplified through ligation PCR. The ligation products used as templates in ligation PCR were prepared by using conventional oligo design and novel oligo design. The same three sets of Aoligos that have been described in FIGS. 11a and 11b were used accordingly. The desired bands are indicated by white arrows, and one sample was loaded into two lanes. Smears were observed when conventional oligo design was used. Antibiotic resistance marker (AR), guide RNA (gRNA), upstream homologous sequence (UHS), and downstream homologous sequence (DHS). (b) Sequencing suggested that the plasmids (nupG-pTarget and glk-pTarget) that were constructed by using conventional oligo design had an undesired insertion (flipped partial barcode of N22) and a flipped fragment (AR was flipped). The red arrow and text indicate the problematic sequences. Sequencing results of the plasmids constructed with novel oligo design were shown to be correct.

FIG. 19: Creating activated fragment by using Noligos. (a) The oligo to create the sense strand is termed G-Noligo (the sense strand starts with G from its 5′ end). G-Noligo is the same as the sense strand (5′ to 3′) sequence except that it does not contain the first nucleotide (G). The oligo to create the anti-sense strand is termed A-Noligo (the antisense strand starts with ‘A’ from its 5′ end). A-Noligo is the same as the antisense strand (5′ to 3′) sequence except that it does not contain the first nucleotide (A). (b) Two short fragments can be efficiently amplified after barcoding, and ligated with two plasmid backbones accordingly. For each plasmid, sequencing confirmed that two plasmids extracted from two positive colonies had the accurate sequences at the junction region. AS (SL18-2): antisense RNA used to alter protein expression; SP (r_ctg): signal peptide used to create secretion expression system in E. coli as shown in FIG. 16b. UHS: Upstream homologous sequence; galP: galactose promoter from S. cerevisiae; ACT1: terminator from S. cerevisiae; HIS3: selection marker from S. cerevisiae; DHS: Downstream homologous sequence; RO: E. coli replication origin; AR: E. coli antibiotic resistance marker; LacIT7p: LacI repressor expression cassette with T7 promoter; gfp: a gene encoding green fluorescence protein; T7T: T7 terminator.
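The Noligo rule in panel (a) is mechanical enough to sketch in Python: each Noligo is the corresponding strand with its first 5′ nucleotide removed. The function name and the example sequence below are hypothetical, and the sketch assumes the fragment is supplied as its sense strand written 5′ to 3′.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_noligos(sense_strand: str):
    """Return (G-Noligo, A-Noligo) for a fragment given as its sense strand.

    The sense strand must start with G and the antisense strand with A,
    i.e. the sense strand must end with T.
    """
    sense = sense_strand.upper()
    antisense = reverse_complement(sense)
    if not sense.startswith("G"):
        raise ValueError("sense strand must start with G")
    if not antisense.startswith("A"):
        raise ValueError("antisense strand must start with A")
    # Drop the first 5' nucleotide of each strand.
    return sense[1:], antisense[1:]


g_noligo, a_noligo = design_noligos("GATTACAT")  # hypothetical fragment
print(g_noligo)
print(a_noligo)
```

When the two resulting oligos are annealed, each strand is one nucleotide short at its 5′ end, so the partner strand protrudes by a single nucleotide at its 3′ end; this appears consistent with the 1-mer sticky ends described for FIG. 2.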

FIG. 20: Improvement of accuracy of the CLIVA method by adding a thermophilic ligase in the assembly step. (a) One fragment was amplified by using long oligos, each of which consists of a template-binding region and a sticky-end (SE) creating region containing two phosphorothioate bonds (*). A 15 bp SE would be generated on both ends of one fragment after chemical treatment, which can be annealed with another fragment carrying the respective complementary 15 bp SE. (b) We amplified six and seven fragments by using the above-mentioned oligos, and assembled them into two testing plasmids. (c) Assembly efficiency and accuracy of six fragments with adjusted molar ratio. Assembly efficiency and accuracy of seven fragments, colony forming units (CFU), and the results of colony PCR and sequencing are shown accordingly. (d) The assembly accuracy of six fragments by using this enhanced CLIVA (eCLIVA) method was significantly improved compared with the original CLIVA²⁴ method. However, it should be noted that the plasmid size involved in the original CLIVA (22 kbp) was much larger than the one used in eCLIVA (4.2 kbp).

FIG. 21: GTas-based library building workflow. This library contains templates, oligos and plasmids. For preparation of activated fragments, Foligo or Noligo can be used, and the 1st activation PCR is required if Foligo is used. For preparation of barcoded fragments, Boligo is required for barcoding, followed by the 2nd activation PCR done with the respective Aoligos. The barcoded fragments acquired are assembled into plasmids. All plasmids constructed under GTas can be deposited into the template library for amplifying an activated composite fragment through the 1st activation PCR. This can be barcoded by a new set of Boligos, and be activated through the 2nd round of PCR as a barcoded composite fragment. With this iterative process, both the flexibility of combination of activated fragments and the utilization of activated fragments can be maximized. This results in an ever-expanding GT standard-based library with more versatile genetic parts.

FIG. 22: Functionality test of nupG-pTarget constructed by using novel oligo design (FIG. 11e). (a) The schematic diagram of the wild-type and the deleted E. coli nupG locus. Black arrows indicate the oligos used for colony PCR verification. (b) Colony PCR demonstrated that six randomly picked colonies had the desired deletion, and the negative control (C) was provided by using wild-type E. coli cells as colony PCR template. White arrows indicate the desired bands. The oligos used for the colony PCR test are listed in Table 19.

FIG. 23: Workflow for construction of the combinatorial plasmid libraries. (a) Construction of 12 plasmids having M1 or M2. The M1 and M2 plasmids were used as PCR templates, and the positive results indicate that both M1 and M2 variants were amplified successfully by using two sets of Aoligos (indicated with black arrows). (b) Six M1 fragments and six M2 fragments were combinatorially assembled with one plasmid backbone (barcoded) in one reaction to create a mixture of 36 plasmids as a plasmid library. We used two plasmid backbones (pSC101+AmpR and pAC+SpecR) and created two plasmid libraries (Library 1 and Library 2). To confirm the existence of the two modules in the plasmids from library 1, two sets of colony PCR were performed, using six randomly selected colonies as templates. The first colony PCR was done to detect the existence of M1 by using the oligos RO1-f, targeting the RO region, and ppsA-r, targeting the ppsA region; it would generate amplicons of varied length because ppsA could be located at any of three positions. In parallel, the second colony PCR was done specifically to detect the existence of M2 by using the oligos AR1-f, targeting the AR region, and tal-r, targeting the tal region. Similarly, the quality of library 2 was verified by using the same strategy. The amplification results showed that each plasmid library had plasmids containing varied combinations of M1 and M2 variants. The desired amplicons from each module of the plasmids from the two libraries are highlighted by white rectangular boxes. (c) The expected amplicon length is listed accordingly for each module of the plasmids from library 1 and 2.

FIG. 24: Optimization of barcoding reaction by using three ligation kits. Three microliters of three fragments of various lengths were ligated with 0.3 μL of RG-Boligo (5UTR) and 0.3 μL of LA-Boligo (3UTR2). This ligation of Boligos with the activated fragments was performed by using three types of ligation kits (T4 ligase, T7 ligase and Blunt/TA ligase master mix). Three sets of ligation mixtures were incubated at 25° C. for 5 or 60 min. One microliter of each ligation product was used as template in PCR by using the oligos RG-Aoligo (5UTR) and LA-Aoligo (3UTR2). We found that, after 5 min of incubation, Blunt/TA ligase master mix evidently outperformed the other ligation kits in terms of the efficiency of barcoding three fragments. T4 ligase-mediated barcoding achieved results similar to those of Blunt/TA ligase master mix after 60 min of incubation. However, T7 ligase could not efficiently barcode the two shorter fragments (0.5 and 0.2 kbp), since no distinct bands were amplified even after incubation for 60 min.

FIG. 25: Comparison of time distribution of three methods' workflow. (a)GTas-based workflow. (b) Restriction enzyme-based workflow. (c) Gibsonassembly-based workflow. Cost time of each step of three workflows iscalculated based on construction of one plasmid that is assembled withtwo fragments. We found that the time limiting steps in current cloningworks are cell culture and sequencing, instead of PCR and assembly step.

DEFINITIONS

As used herein, the term “comprising” or “including” is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more features, integers, steps or components, or groups thereof. However, in context with the present disclosure, the term “comprising” or “including” also includes “consisting of”. The variations of the word “comprising”, such as “comprise” and “comprises”, and “including”, such as “include” and “includes”, have correspondingly varied meanings.

As used herein, a scar refers to additional nucleotide(s) left between joined nucleic acid molecules after ligation. For example, the scar is typically left over from the linking sequences. It will be appreciated that such scar(s) may affect biological function.

As used herein, a “stem-loop structure” refers to a nucleic acid secondary structure that includes a region of nucleotides which are known or predicted to form a double strand (stem portion) that is linked on one side by a region of predominantly single-stranded nucleotides (loop portion). Stem-loop structures also include “hairpin” and “fold-back” structures. Such structures and terms are known in the art. The actual primary sequence of nucleotides within the stem-loop structure is not critical as long as the secondary structure is present. As is known in the art, the secondary structure does not require exact base-pairing. Thus, the stem may include one or more base mismatches. Alternatively, the base-pairing may not include any mismatches.

As used herein, a “tag” is a sequence of nucleic acid, called the “tag sequence,” that permits identification, recognition, and/or molecular or biochemical manipulation of the DNA to which the tag is joined or attached. For example, a tag may provide a site for annealing a primer (i.e., a “priming site”) for DNA sequencing or a nucleic acid amplification reaction. The tag sequence may comprise non-coding sequences or coding sequences. With respect to non-coding sequences, the tag sequence includes but is not limited to promoter sequences, ribosome binding sequences, exonic sequences, regulatory sequences, termination sequences, origin of replication sequences or part thereof. With respect to coding sequences, the tag sequences may contain a complete coding sequence or open reading frame or part thereof. Examples of coding sequences include but are not limited to antibiotic resistance genes and gfp (which encodes green fluorescent protein).

The process of joining the tag to the nucleic acid molecule is sometimes referred to herein as “tagging” and the nucleic acid that undergoes tagging is referred to as “tagged” (e.g., “tagged nucleic acid”). The tag can comprise one or more “tag portions,” which means herein a portion of the tag that contains a sequence for a desired intended purpose or application. The names and descriptions of different tag portions used herein are for convenience, such as to make it easier to understand and discuss the intended purposes and applications of the different portions of the tag in different embodiments. When a tag is used for identification, it may be referred to as a barcode sequence. As used herein, a barcode sequence comprises a predefined sequence that can be used to identify a nucleic acid molecule or used for assembling nucleic acid molecules. For example, the barcode sequence may be ligated/joined to a nucleic acid molecule to tag, identify or assemble a nucleic acid molecule.

With respect to a tag sequence serving as a linking sequence, the tag sequence may include restriction enzyme sites, for example. Alternatively, compatible cohesive overhangs may be generated in the linking sequences using a chemical cleavage method. It will be appreciated that using a chemical cleavage method may reduce the size of the tag sequence since additional restriction site sequences need not be incorporated, which is an advantage. It will further be appreciated that by carefully designing suitable tag sequences and ligating such tag sequences to different nucleic acid molecules, different nucleic acid molecules may be assembled in any order and combination. It will be appreciated that such tag sequences may be used as the Universal DNA-assembly Standard-BPs.

For maximum efficiency, it will be appreciated that a tag sequence may serve several functions, such as serving as a primer binding site or a linking sequence, including a coding sequence or part thereof or a non-coding sequence or part thereof, and serving as a barcoding sequence. As will be appreciated, this also may reduce the size of the tag sequence, which is an advantage.

In particular, linking sequences are used for the assembly of nucleic acid molecules. When a nucleic acid molecule tagged with a linking sequence is joined to one or more other tagged nucleic acid molecules to form the assembly, the linking sequences may together form a complete sequence (for example, a complete coding sequence or a complete promoter region). The further assembly method is very versatile and/or flexible and this versatility and/or flexibility will be more fully appreciated from this specification.

As used herein, a “barcode sequence” refers to a unique oligonucleotide sequence used to identify a nucleic acid base and/or nucleic acid sequence tagged with the barcode sequence.

DETAILED DESCRIPTION OF THE INVENTION

According to a first aspect, the present invention provides a method for ligating at least two nucleic acid molecules comprising:

-   (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end;
-   (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and
-   (iii) ligating the first nucleic acid molecule to the second nucleic acid molecule at the complementary overhangs to form a single nucleic acid molecule.

In particular, the second nucleic acid molecule comprises a defined sequence. For example, the defined sequence of the second nucleic acid molecule may further comprise a tag sequence, a barcode sequence and/or a linking sequence. It will be appreciated that the second nucleic acid molecule is useful for tagging the first nucleic acid molecule.

For the method according to the first aspect of the invention, step (i) may comprise the steps of:

-   (i)(a) providing a double-stranded nucleic acid template comprising a first nucleic acid strand and a second nucleic acid strand substantially reverse complementary to the first nucleic acid strand;
-   (i)(b) providing a first primer comprising a first sequence with at least one modified nucleotide upstream of a second sequence substantially complementary to the first strand of the nucleic acid template; and a second primer comprising a sequence substantially complementary to the second strand of the nucleic acid template;
-   (i)(c) amplifying the nucleic acid template using the first and second primers in a polymerase chain reaction to produce an amplicon; and
-   (i)(d) chemically cleaving the amplicon to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end;

or

-   (i)(a) providing a first single-stranded nucleic acid molecule;
-   (i)(b) providing a second single-stranded nucleic acid molecule substantially complementary to the first single-stranded nucleic acid molecule; and
-   (i)(c) allowing the first and second single-stranded nucleic acid molecules to anneal to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end.

The number of nucleotides in the first overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the second nucleic acid molecule may be 1, 2 or 3. It will be appreciated that the number of nucleotides in the first overhang of the first nucleic acid molecule and the overhang of the second nucleic acid molecule may be the same number. For example, the number of nucleotides in the first overhang of the first nucleic acid molecule is 1, and the number of nucleotides in the overhang of the second nucleic acid molecule is also 1.

According to a second aspect, the present invention provides a method for ligating three nucleic acid molecules comprising:

-   (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other;
-   (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and also providing a third nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the third nucleic acid molecule is substantially complementary to the second overhang of the second end of the first nucleic acid molecule; and wherein the overhang of the second nucleic acid molecule and the overhang of the third nucleic acid molecule have different sequences and/or are not complementary to each other; and
-   (iii) ligating the first overhang at the first end of the first nucleic acid molecule to the overhang of the second nucleic acid molecule and also the second overhang of the second end of the first nucleic acid molecule to the overhang of the third nucleic acid molecule to form a single nucleic acid molecule.

In particular, the second nucleic acid molecule comprises a first defined sequence and the third nucleic acid molecule comprises a second defined sequence. The first defined sequence of the second nucleic acid molecule and the second defined sequence of the third nucleic acid molecule may be the same sequence, substantially the same sequence or may be different sequences. The first defined sequence of the second nucleic acid molecule may further comprise a first tag sequence, a first barcode sequence and/or a first linking sequence. The second defined sequence of the third nucleic acid molecule may comprise a second tag sequence, a second barcode sequence and/or a second linking sequence. It will be appreciated that the second nucleic acid molecule and/or the third nucleic acid molecule are useful for tagging the first nucleic acid molecule.

For the method according to the second aspect of the invention, step (i) may comprise the steps of:

-   (i)(a) providing a double-stranded nucleic acid template comprising a first nucleic acid strand and a second nucleic acid strand substantially reverse complementary to the first nucleic acid strand;
-   (i)(b) providing a first primer comprising a first sequence with at least one modified nucleotide upstream of a second sequence substantially complementary to the second strand of the nucleic acid template; and a second primer comprising a third sequence with at least one modified nucleotide upstream of a fourth sequence substantially complementary to the first strand of the nucleic acid template;
-   (i)(c) amplifying the nucleic acid template using the first and second primers to produce an amplicon; and
-   (i)(d) chemically cleaving the amplicon to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other;

or

-   (i)(a) providing a first single-stranded nucleic acid molecule;
-   (i)(b) providing a second single-stranded nucleic acid molecule substantially complementary to the first single-stranded nucleic acid molecule; and
-   (i)(c) allowing the first and second single-stranded nucleic acid molecules to anneal to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end.

The number of nucleotides in the first overhang of the first nucleic acid molecule and the overhang of the second nucleic acid molecule may be 1, 2 or 3. It will be appreciated that the number of nucleotides in the first overhang of the first nucleic acid molecule and the overhang of the second nucleic acid molecule may be the same. For example, the number of nucleotides in the first overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the second nucleic acid molecule are both 1.

Independently of the first overhang of the first nucleic acid molecule and the overhang of the second nucleic acid molecule, the number of nucleotides in the second overhang of the first nucleic acid molecule and the overhang of the third nucleic acid molecule may be 1, 2 or 3. It will be appreciated that the number of nucleotides in the second overhang of the first nucleic acid molecule and the overhang of the third nucleic acid molecule may be the same. For example, the number of nucleotides in the second overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the third nucleic acid molecule are both 1. In a particular embodiment, the number of nucleotides in the first overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the second nucleic acid molecule are both 1, and the number of nucleotides in the second overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the third nucleic acid molecule are also both 1.
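By way of illustration only, the overhang relationships of the second aspect can be checked with a short Python sketch. The function names and the example overhangs are hypothetical and do not form part of the method; all overhangs are assumed to be written 5′ to 3′ with the same polarity, and “substantially complementary” is simplified here to exact Watson-Crick pairing over overhangs of equal length.

```python
# Illustrative sketch only: exact pairing stands in for "substantially complementary".
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_pair(overhang_a, overhang_b):
    """True if two overhangs of equal length can base-pair exactly."""
    return (len(overhang_a) == len(overhang_b)
            and overhang_a.upper() == reverse_complement(overhang_b))


def check_design(first_overhang, second_overhang,
                 second_molecule_overhang, third_molecule_overhang):
    """Check the pairing relationships stated for the three molecules."""
    return {
        "second molecule pairs with first overhang":
            can_pair(first_overhang, second_molecule_overhang),
        "third molecule pairs with second overhang":
            can_pair(second_overhang, third_molecule_overhang),
        "stem-loop overhangs differ and do not pair with each other":
            second_molecule_overhang.upper() != third_molecule_overhang.upper()
            and not can_pair(second_molecule_overhang, third_molecule_overhang),
    }


# Hypothetical 1-nt design: fragment overhangs G and T, stem-loop overhangs C and A.
for check, ok in check_design("G", "T", "C", "A").items():
    print(check, "->", ok)
```

In this simplified model, a design in which the two stem-loop overhangs are complementary to each other (for example C and G) fails the third check, which corresponds to the risk of the two stem-loop molecules ligating to each other rather than to the first nucleic acid molecule.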

In one example, the method further comprises the steps of:

(iv) using the single nucleic acid molecule from step (iii) as a template for amplifying in a polymerase chain reaction with two amplification primers to produce an amplicon; and

(v) performing a ligation to join the amplicon with at least one nucleic acid molecule to form an assembly comprising the ligated nucleic acid molecules.

In particular, step (iv) may comprise amplifying the template with an amplification primer comprising at least one modified nucleotide and another amplification primer comprising at least one modified nucleotide to produce the amplicon; chemically cleaving the amplicon to produce an end with a third overhang; and step (v) comprises ligating the amplicon to another nucleic acid molecule with an overhang substantially complementary to the third overhang. The at least one other nucleic acid molecule in step (v) may be an amplicon from step (iv) using another nucleic acid molecule from step (iii) as a template. It will be appreciated that in one embodiment, the amplicon and the at least one other nucleic acid molecule have different sequences.

It will be further appreciated that in another embodiment, step (v) may comprise ligating to form an assembly of ligated nucleic acid molecules comprising a concatemer of nucleic acid molecules, wherein each nucleic acid molecule of the concatemer comprises substantially the same sequence.

It will be appreciated that the assembly comprising the ligated nucleic acid molecules is circular. It will be appreciated that said circular assembly may comprise a plasmid.

In another example, the method further comprises the steps of:

-   (iv) using the single nucleic acid molecule from step (iii) as a template for amplifying in a polymerase chain reaction with two amplification primers to produce an amplicon; and
-   (v) performing a ligation to join the amplicon to a plurality of nucleic acid molecules to form an assembly of joined plurality of nucleic acid molecules.

In particular, step (iv) comprises amplifying the template with an amplification primer comprising at least one modified nucleotide and another amplification primer comprising at least one modified nucleotide; further chemically cleaving the amplicon to produce a first end with a third overhang and a second end with a fourth overhang.

More in particular, each of the plurality of nucleic acid molecules is an amplicon from step (iv) using another single nucleic acid molecule from step (iii) as a template.

It will be appreciated that the amplicon and each of the plurality of nucleic acid molecules may have different sequences.

Alternatively, step (v) may comprise ligating to form a concatemer of nucleic acid molecules, each with substantially the same sequence.

The assembly of joined plurality of nucleic acid molecules may be circular. In particular, the circular assembly of joined plurality of nucleic acid molecules comprises a plasmid.

In an exemplification of the method according to the second aspect of the invention, the second nucleic acid molecule may comprise a first defined sequence and the third nucleic acid molecule may comprise a second defined sequence. In a further exemplification, the first defined sequence of the second nucleic acid molecule may comprise a first tag sequence, a first barcode sequence and/or a first linking sequence and the second defined sequence of the third nucleic acid molecule may comprise a second tag sequence, a second barcode sequence and/or a second linking sequence. It will be appreciated that for these exemplifications, the method may further comprise the steps of:

(iv) using the single nucleic acid molecule from step (iii) as a template for amplifying in a polymerase chain reaction with an amplification primer having a sequence designed based on at least part or all of the first defined sequence and comprising at least one modified nucleotide and another amplification primer having a sequence designed based on at least part or all of the second defined sequence and comprising at least one modified nucleotide to produce an amplicon, chemically cleaving the amplicon to produce a first end with a third overhang and a second end with a fourth overhang; and

(v) performing a ligation to join the amplicon to a plurality of nucleic acid molecules to form an assembly of joined plurality of nucleic acid molecules; wherein each of the plurality of nucleic acid molecules is an amplicon from step (iv).

It will be appreciated that each of the plurality of nucleic acid molecules is an amplicon from step (iv) using another single nucleic acid molecule from step (iii) as a template.

It will be further appreciated that the amplification primers are designed based on the defined sequences of a said second nucleic acid molecule as applicable, and these amplification primers may be used to order and/or arrange the plurality of nucleic acid molecules in the assembly of joined plurality of nucleic acid molecules.

It will be appreciated that each defined sequence of said second nucleic acid molecule as applicable may comprise a tag sequence, barcode sequence and/or a linking sequence. The barcode sequence may include a first linking sequence flanking the left side of the barcode sequence and/or a second linking sequence flanking the right side of the barcode sequence. It will be appreciated that the left side of the barcode sequence may be considered upstream of the barcode sequence. It will be appreciated that the right side of the barcode sequence may be considered downstream of the barcode sequence.

It will be appreciated that in designing the amplification primers, an amplification primer may comprise a tag sequence comprising a first portion of a barcode sequence and either the left or right tag sequence from a said corresponding second nucleic acid molecule as applicable. It will be appreciated that another amplification primer may comprise a tag sequence comprising a second portion of a barcode sequence from another said corresponding second nucleic acid molecule as applicable. It will be appreciated that the first portion of the barcode sequence and the second portion of the barcode sequence when put together may also be considered a barcode sequence. It will be appreciated that a said second nucleic acid molecule comprising a stem-loop structure is used to tag one end of a nucleic acid molecule from the plurality of nucleic acid molecules in the assembly. It will be appreciated that another said applicable corresponding second nucleic acid molecule comprising a stem-loop structure is used to tag one end of another nucleic acid molecule from the plurality of nucleic acid molecules in the assembly. It will be appreciated that the order and arrangement of the plurality of nucleic acid molecules in the assembly may be determined by selecting applicable second nucleic acid molecules comprising a stem-loop structure for each of the plurality of nucleic acid molecules, in addition to the amplification primers. (Please refer to Example 7 and FIG. 12.)

It will be appreciated that the amplicon and each of the plurality of nucleic acid molecules may have different sequences. The assembly of joined plurality of nucleic acid molecules may be circular. In particular, the circular assembly of joined plurality of nucleic acid molecules comprises a plasmid. It will be appreciated that the method may further comprise using polymerase chain reaction with amplification primers designed based on applicable defined sequences to implement a modification in the assembly of joined plurality of nucleic acid molecules, wherein the modification includes inserting at least one nucleic acid molecule into the assembly, removing at least one joined nucleic acid molecule from the assembly or replacing at least one joined nucleic acid molecule in the assembly. It will be appreciated that the modification may be implemented to form a library of different plasmids.

The present method provides a lot of flexibility and versatility in the design of the nucleic acid molecules capable of forming a stem-loop structure with an overhang of at least one nucleotide. The design of the nucleic acid molecules may be computer-implemented. For any aspect of the invention, any of the overhangs may independently comprise any number of nucleotides. Overhangs that are for joining together will typically have the same number of nucleotides. It will be appreciated that the overhangs may not be additional nucleotide sequences which serve no purpose but form scars. As an illustration, the overhang may be part of a useful coding sequence, for example. This minimizes wastage as there are no scars. It will be appreciated that an overhang for any aspect of the invention may be a 5′ or a 3′ overhang. For the second aspect of the invention, the first overhang and the second overhang may independently be a 5′ overhang or a 3′ overhang.

It will also be appreciated that the number of nucleotides in any of the overhangs can be reduced to as small as possible, especially for standardized BPs. For example, the number of nucleotides in the overhang(s) may be 1, 2 or 3. In particular, the number of nucleotides in the overhang(s) is 1, such that the scar size is one nucleotide long.

It will be appreciated that the number of nucleotides for the first overhang and the overhang of the second nucleic acid molecule are independent of the number of nucleotides for the second overhang and the overhang of the third nucleic acid molecule.

For the second aspect of the invention, the first overhang and the second overhang may have different sequences and/or are not complementary to each other. For example, it will be appreciated that this helps to ensure the desired ligation occurs. If the number of nucleotides in the first and second overhang is 1, it will be appreciated that if the first overhang is a G or a C, the second overhang should be an A or a T and vice versa.
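The single-nucleotide rule above can be reproduced with a minimal Python sketch. It is offered as an illustration under stated assumptions rather than as part of the method: the function names are hypothetical, both overhangs are assumed to be written 5′ to 3′ with the same polarity, and complementarity is simplified to exact Watson-Crick pairing.

```python
# Illustrative sketch only; exact Watson-Crick pairing is an assumed simplification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def overhangs_are_orthogonal(first, second):
    """True if the two overhangs of one fragment differ and cannot pair with each other."""
    first, second = first.upper(), second.upper()
    return first != second and first != reverse_complement(second)


def allowed_second_overhangs(first):
    """List the 1-nt second overhangs compatible with a given 1-nt first overhang."""
    return sorted(b for b in "ACGT" if overhangs_are_orthogonal(first, b))


print(allowed_second_overhangs("G"))  # ['A', 'T']
print(allowed_second_overhangs("A"))  # ['C', 'G']
```

Under these assumptions, a first overhang of G or C leaves only A or T for the second overhang, and vice versa, in line with the rule stated above.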

Theoretically, the minimal length of a sticky end is 1 nt, and even with 1 nt, there is still a large degree of freedom left in selecting amino acid codons. For example, we can define all fragments to start with ‘G’ and end with ‘T’. Then, if this is a protein-coding sequence and we intend to express it as a standalone protein, we can choose barcodes to flank the fragment by ‘ATG’ and ‘TGA’, which are start and stop codons respectively; if we plan to fuse this protein to others, we could select barcodes to flank it by ‘GGG’ and ‘TCT’, which encode the flexible amino acids glycine and serine respectively (FIG. 5). When a fragment is not a protein-coding sequence, one can usually find a ‘G’ and a ‘T’ in its natural sequence to define the beginning and the end of the fragment.
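As a rough illustration of this codon choice, the following Python sketch joins the last two nucleotides of an upstream barcode and the first two nucleotides of a downstream barcode to a fragment that starts with ‘G’ and ends with ‘T’, and reports the two junction codons. The function name, the example fragment sequence and the small codon lookup are assumptions introduced only for this example, as is the way the barcode nucleotides are split across each junction.

```python
# Only the four codons named in the text; this is not a full genetic code table.
CODON_MEANING = {
    "ATG": "start (Met)",
    "TGA": "stop",
    "GGG": "Gly",
    "TCT": "Ser",
}


def junction_codons(upstream_barcode, fragment, downstream_barcode):
    """Return the codons formed at the left and right barcode-fragment junctions.

    Assumes the fragment contributes its first 'G' to the left codon and its
    last 'T' to the right codon, with the barcodes supplying the other two bases.
    """
    fragment = fragment.upper()
    if not (fragment.startswith("G") and fragment.endswith("T")):
        raise ValueError("fragment must start with 'G' and end with 'T'")
    left = upstream_barcode.upper()[-2:] + fragment[0]
    right = fragment[-1] + downstream_barcode.upper()[:2]
    return left, right


fragment = "GCTAAAGGT"  # hypothetical fragment, for illustration only
for upstream, downstream in (("AT", "GA"), ("GG", "CT")):
    left, right = junction_codons(upstream, fragment, downstream)
    print(left, CODON_MEANING[left], "|", right, CODON_MEANING[right])
```

With the same fragment, the first barcode pair yields the start and stop codons for standalone expression, and the second pair yields glycine and serine codons suitable for a fusion.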

It will be appreciated that dephosphorylation of the 5′ end(s) of nucleic acid molecules as appropriate may be utilised to increase the ligation of the desired nucleic acid molecules in any ligation and/or assembly reaction. For example, either the first nucleic acid molecule may be dephosphorylated or the nucleic acid molecule(s) capable of forming a stem-loop structure may be dephosphorylated. Similarly, for forming an assembly of a joined plurality of nucleic acid molecules, it is important to identify and select the nucleic acid molecules for dephosphorylation to increase the formation of the desired assembly.

It will be appreciated that chemical cleavage comprises non-enzymatic cleavage. Any suitable modified nucleotide may be utilised and an applicable cleavage method may be used. The modified nucleotides and cleavage method as described in WO 2000/18967²⁷ may be adapted for the present invention.

Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention.

The invention includes a nucleic acid molecule comprising a defined sequence capable of forming a stem-loop structure with an overhang of one nucleotide. The invention also includes a nucleic acid molecule comprising a defined sequence capable of forming a stem-loop structure with an overhang of at least one nucleotide.

It will be appreciated that the defined sequence of a nucleic acid molecule according to the invention may comprise a tag sequence, a barcode sequence and/or a linking sequence. The tag sequence, barcode sequence and/or linking sequence may comprise a coding region or part thereof. Alternatively, the tag sequence, barcode sequence and/or linking sequence may comprise a non-coding region or part thereof. It will be appreciated that the tag sequence, barcode sequence and/or linking sequence may include but is not limited to a ribosomal binding site or part thereof, a linker peptide sequence or part thereof, a 2a peptide sequence or part thereof, a protein tag sequence or part thereof, an untranslated sequence or part thereof, or a promoter sequence or part thereof.

The invention also includes a kit comprising a plurality of nucleic acid molecules, each with a defined sequence capable of forming a stem-loop structure with an overhang of at least one nucleotide. Each defined sequence may independently comprise a tag sequence, a barcode sequence and/or a linking sequence. Said tag sequence, barcode sequence and/or linking sequence may comprise a coding region or part thereof. Alternatively, said tag sequence, barcode sequence and/or linking sequence may comprise a non-coding region or part thereof.

The kit may further comprise one or a plurality of oligonucleotide(s). It will be appreciated that said one oligonucleotide or each oligonucleotide from the plurality of oligonucleotides is capable of annealing to a defined sequence of at least one of the plurality of nucleic acid molecules.

In a particular embodiment of the kit, the kit may further comprise a plurality of oligonucleotides, wherein each oligonucleotide is capable of annealing to a defined sequence of a corresponding nucleic acid molecule from the plurality of nucleic acid molecules in the kit.

It will be appreciated that the oligonucleotides can serve as (amplifying) primers in a polymerase chain reaction.

EXAMPLES

Standard molecular biology techniques known in the art and not specifically described were generally followed as described in Green and Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (2012)²⁶.

Example 1: Development of a Universal DNA-Assembly Standard and the Enabling Technology

For ease of understanding, we define two types of standardized biological parts (namely fragments and barcodes), and consider any DNA molecule to be made of fragments and barcodes appearing in alternating order (FIG. 1). The difference between fragments and barcodes is their length: in this first example, a barcode is shorter than 60 nt and a fragment is longer than 60 nt. Both fragments and barcodes can be functional. To assemble standardized fragments and barcodes in a pre-defined order, we need to add one barcode onto each side of a fragment, and then use these barcodes to direct the desired assembly of fragments (FIG. 1).
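The length-based definition and the alternating arrangement can be expressed as a small Python sketch. This is a hypothetical illustration only: the function names are not taken from this work, and because the text does not say how a part of exactly 60 nt is classified, the sketch treats that case as undefined.

```python
# Threshold stated for this first example; other examples may use other values.
LENGTH_THRESHOLD_NT = 60


def classify_part(sequence):
    """Classify a part as 'barcode' (< 60 nt) or 'fragment' (> 60 nt)."""
    length = len(sequence)
    if length < LENGTH_THRESHOLD_NT:
        return "barcode"
    if length > LENGTH_THRESHOLD_NT:
        return "fragment"
    # The text leaves a part of exactly 60 nt unclassified.
    raise ValueError("a part of exactly 60 nt is not classified in the text")


def is_alternating(parts):
    """True if fragments and barcodes appear in alternating order."""
    kinds = [classify_part(part) for part in parts]
    return all(a != b for a, b in zip(kinds, kinds[1:]))


# Hypothetical parts: a 20-nt barcode and a 100-nt fragment.
barcode = "ACGT" * 5
fragment = "ACGT" * 25
print(is_alternating([barcode, fragment, barcode]))   # True
print(is_alternating([fragment, fragment, barcode]))  # False
```

A check of this kind could be applied to a planned construct before assembly to confirm that every fragment is flanked by barcodes.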

In this work, we developed a new BP standard (termed the Universal DNA-assembly Standard, UDS, or GT Assembly Standard, GTas; FIG. 1) and the technology needed to implement it. This method can assemble up to seven standardized BPs in any order without using any customized parts, does not leave any scar in the final construct, does not need to avoid any forbidden restriction enzyme site, and is compatible with multi-tier DNA assembly. To demonstrate the usefulness of this method, we built a small UDS library of standardized BPs, and used it to construct a large panel of plasmids for producing isoprenoid and aromatic compounds in Escherichia coli (E. coli). These plasmids allowed us to fine-tune expression of multiple genes and implement CRISPR-Cas9-based genome editing. With an expanded library of UDS BPs and more participation from the research community, this technology would substantially reduce the time needed, and cost incurred, in cell line development, in metabolic engineering, synthetic biology, and other biotechnological applications.

By using phosphorothioate oligos, PCR and a chemical cleavage method, we can control the length of the sticky end and create 1 nt sticky ends on fragments (FIG. 2a). Thereafter, two barcode oligos, each with a stem-loop structure, were joined to a DNA fragment to form a closed DNA molecule with two stem-loop secondary structures (FIG. 2e). This strategy eliminated the addition of tandem barcodes to the fragment, because no free end is left for further ligation once a fragment is flanked by two stem loops (FIG. 2e). When we used such stem-loop oligos to barcode five different fragments, we indeed managed to amplify all of them without any smear (FIG. 2e), and could accurately assemble these fragments into one functional plasmid (FIG. 2f) by using an enhanced version of the CLIVA method (work flow diagram: FIG. 6; plasmid verification results: FIG. 7).

For comparison, we also added barcodes to a common cloning vector (used as an entry vector) and created 1 nt sticky ends outside the barcodes (FIG. 2b); however, it was very difficult to carry out the ligation based on such short sticky ends, due to high Gibbs free energy. We assessed the ligation efficiency by amplifying the barcoded fragment from the ligation product of the fragment and the vector via PCR. Here, we attempted to barcode five fragments and found that all the PCR reactions had low yield and non-specific products (FIG. 2c), implying low ligation efficiency in all cases.

For further comparison, we created barcodes by annealing two oligos, which were designed to produce the desired 1 nt sticky end (SE) after being annealed (FIG. 2d). After two barcodes produced this way were incubated for a period with a fragment containing 1 nt sticky ends, we attempted to amplify the barcoded fragment by PCR (we tested the same set of fragments and barcodes). The results showed that the PCR yield was much higher than in the previous case, but many non-specific products were still produced, as evidenced by the heavy smear in the picture (FIG. 2d). In the literature, such smears have been attributed to the presence of barcode oligos (those used to form the barcodes) in the PCR reaction, so we removed the excess oligos by magnetic-bead-based purification before conducting PCR amplification. However, the smears were not substantially reduced by this step (data not shown).

To further understand the root cause of the heavy smear, we cloned some of the ligation products into a vector and sequenced the resulting constructs. The data showed that some fragments had tandem barcodes on their sides. We therefore hypothesized that the successfully barcoded fragments (with one barcode on each side) could be further linked to barcode oligos via blunt-end-like ligation, and that the resulting complex product mixture caused formation of the non-specific products in the subsequent PCR reactions.

To further validate the robustness and general applicability of the UDS BPs and the technologies that enable them, we designed a small library of UDS fragments and barcodes, and have used them to construct ~50 unique plasmids for various biotechnological applications. The statistics of these plasmid constructions show that we have experimentally validated 200 plasmids (including replicates) and that 90% of them were confirmed to be correct (Table 1). Most of these plasmids were created by assembling 5-7 fragments (Table 1), the size of the fragments used ranged from 53 bp to 5300 bp, and the largest plasmid we have constructed was more than 10 kb (data not shown). Together, these cover the ranges commonly used in biotechnological applications.

TABLE 1 Statistics of plasmid construction by using UDS biological parts
Line | Method of adding barcodes | Number of fragments | Molar ratio of fragments | CFU per transformation | Accuracy
1 | Use standardized oligos | 3 | Equal | 100-200 | 6/6
2 | Use standardized oligos | 5 | Equal | 50-500 | 129/144
3 | Use standardized oligos | 6 | Equal | 50-500 | 43/48
4 | Use standardized oligos | 7 | Equal | 10-100 | 43/48
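The accuracy figures in Table 1 can be pooled with a few lines of arithmetic. The following is only an illustrative sketch: the row values are copied from Table 1, while the function names `per_row_accuracy` and `pooled_accuracy` are hypothetical helpers introduced here and are not part of the described method.

```python
# Rows copied from Table 1: (number of fragments, correct clones, clones tested).
TABLE_1 = [
    (3, 6, 6),
    (5, 129, 144),
    (6, 43, 48),
    (7, 43, 48),
]


def per_row_accuracy(rows):
    """Map fragment count -> fraction of tested clones that were correct."""
    return {n_frag: correct / tested for n_frag, correct, tested in rows}


def pooled_accuracy(rows):
    """Pool all rows and return (correct, tested, fraction correct)."""
    correct = sum(row[1] for row in rows)
    tested = sum(row[2] for row in rows)
    return correct, tested, correct / tested


if __name__ == "__main__":
    for n_frag, acc in per_row_accuracy(TABLE_1).items():
        print(f"{n_frag} fragments: {acc:.1%}")
    correct, tested, acc = pooled_accuracy(TABLE_1)
    print(f"pooled: {correct}/{tested} = {acc:.1%}")
```

Pooling the four rows gives 221 correct clones out of 246 tested, i.e. roughly 90%, in line with the figure quoted in the text. The denominators in Table 1 sum to 246, somewhat more than the round number of 200 plasmids mentioned above.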

Example 2: Biotechnological Applications

We have completed a few applications to demonstrate the usefulness of UDS BPs and the plasmids derived from them.

The first application demonstrated that UDS BPs can be used to construct a panel of plasmids for fine-tuning the expression level of multiple genes, which has been shown to be critical in many biotechnological applications. For example, in metabolic engineering, balancing the expression level of multiple genes is crucial in optimizing production of value-added chemicals. To date, constructing such plasmids requires customized DNA oligos for each application, which is time-consuming and expensive. In this example, we optimized production of valencene, a fragrance molecule, from glucose in E. coli by systematically changing the expression level of four genes in the biosynthetic pathway (dxs, idi, ispA and valC). Specifically, we arranged these four genes in an operon and shuffled their order, which would alter their expression level, because genes closer to the promoter are transcribed at a higher level in an operon. In total, there are 24 (4!) variants, and they can be easily constructed by using five fragments (the four genes and one expression vector that contains the T7 promoter, T7 terminator, LacI repressor module, p5 replication origin and spectinomycin antibiotic resistance) and five barcodes (RBS1, RBS2stop, RBS3stop, RBS4stop and 3UTR), which encode various ribosomal binding sites with or without a stop codon and a 3′ untranslated region (FIG. 3a). We introduced each of these 24 plasmids into an E. coli strain, and cultured the production strains at three IPTG concentrations (0, 0.005 and 0.1 mM; IPTG is the inducer of the T7 promoter), which dictate the overall transcription level of the operon. Combining IPTG concentration with gene order generated 72 conditions in total, covering a large search space, as evidenced by the large range of valencene titers detected under these conditions (from 0.05 to 5 mg/L, FIG. 3b). Though the product titer is not yet impressive, this proof-of-concept study has shown that the UDS parts can be easily used to construct a large number of plasmids and to tune the expression level of multiple genes.
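The combinatorics of this screen can be sketched in code. The gene names, IPTG levels and barcode names below are taken from the text; the assumption that the gene at operon position i is flanked by the i-th and (i+1)-th barcode, and the "(prefix)gene(suffix)" naming, are inferred from the naming pattern of the barcoded fragments in Table 9 and are not stated explicitly. All function names are hypothetical.

```python
from itertools import permutations, product

GENES = ("dxs", "idi", "ispA", "valC")
IPTG_MM = (0, 0.005, 0.1)
# Assumed positional barcode order (inferred from the Table 9 naming pattern).
POSITION_BARCODES = ("RBS1", "RBS2stop", "RBS3stop", "RBS4stop", "3UTR")


def operon_variants(genes=GENES):
    """All gene orders for the operon: 4! = 24 for four genes."""
    return list(permutations(genes))


def screening_conditions(genes=GENES, iptg=IPTG_MM):
    """Every (gene order, IPTG concentration) pair."""
    return list(product(permutations(genes), iptg))


def barcoded_parts(order, barcodes=POSITION_BARCODES):
    """Name the barcoded gene fragments needed for one gene order."""
    return [f"({barcodes[i]}){gene}({barcodes[i + 1]})"
            for i, gene in enumerate(order)]


def distinct_parts(genes=GENES):
    """Set of barcoded gene fragments needed to build every gene order."""
    parts = set()
    for order in permutations(genes):
        parts.update(barcoded_parts(order))
    return parts


if __name__ == "__main__":
    print(len(operon_variants()), "operon variants")
    print(len(screening_conditions()), "screening conditions")
    print(len(distinct_parts()), "distinct barcoded gene fragments")
```

Under these assumptions, the 24 operons share only 16 barcoded gene fragments (four genes at four positions), which illustrates why no per-plasmid custom oligos are needed.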

The second application focused on customizing expression vectors by using UDS BPs. The replication origin and promoter often need to be changed in biological applications to enable use of multiple plasmids and/or to optimize the protein expression level. In this study, we constructed 16 plasmids for optimizing production of tyrosine, a useful amino acid, from glucose in E. coli (FIG. 4a). We combined four replication origins (p5, p15, pMB1 and pMB1 mutant), two promoters (T7 and lac) and two gene orders (tyrA-aroG or aroG-tyrA). Each of these components is a standardized fragment in our small UDS library, so construction of the 16 plasmids was straightforward: six or seven fragments were assembled each time with the proper barcodes to create one plasmid in one step. Again, E. coli strains carrying these plasmids showed varying abilities to produce tyrosine, with titers ranging from 0.1 to 1.4 g/L (FIG. 4b). These data allowed us to swiftly identify the optimal condition and to learn that medium-copy-number plasmids worked best for tyrosine production. This information would be very useful in further developing this strain into an industrial workhorse.
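The 16-plasmid design space is a plain Cartesian product of the three design choices. A minimal sketch, using the component names from the text and a hypothetical helper `tyrosine_designs`:

```python
from itertools import product

ORIGINS = ("p5", "p15", "pMB1", "pMB1 mutant")
PROMOTERS = ("T7", "lac")
GENE_ORDERS = (("tyrA", "aroG"), ("aroG", "tyrA"))


def tyrosine_designs():
    """Enumerate every origin x promoter x gene-order combination."""
    return [
        {"origin": origin, "promoter": promoter, "gene_order": order}
        for origin, promoter, order in product(ORIGINS, PROMOTERS, GENE_ORDERS)
    ]


if __name__ == "__main__":
    for design in tyrosine_designs():
        print(design)
```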

The third application focused on constructing plasmids for CRISPR-Cas9, which has revolutionized how biotechnology is being done^(11,12). Even for E. coli, an organism that is considered easy to manipulate, CRISPR has made inserting a large number of genes into the genome much easier, because CRISPR does not require marker recycling^(11,13). Based on a two-plasmid CRISPR system for E. coli¹³, we designed some parts for genome editing of E. coli, and used them to perform two types of editing. First, we constructed plasmids for knocking out various genes, and optimized the genome-editing efficiency by tuning the length of the homologous arms (Table 2). We further constructed plasmids for inserting expression cassettes into the E. coli genome via the CRISPR system. Here we defined each gene knockout plasmid as a new UDS BP, which could be combined easily with any cassette to be inserted by using two barcodes (FIG. 8). We have experimentally verified that a short 300 bp demo fragment could be integrated into the E. coli genome successfully (Table 2). Plasmids carrying larger insertions (the optimal operons identified from the valencene and tyrosine screening) are also being constructed and will be tested.

TABLE 2 Characterization of CRISPR-Cas9 plasmids for E. coli gene deletion and insertion
E. coli locus | gRNA | Homologous arm length (bp) | Gene deletion efficiency | Gene insertion efficiency
nupG | Optimized | 500 | 6/6 | 6/6
nupG | . . . | 1000 | 8/8 | Undergoing
nupG | . . . | 1500 | 6/8 | Undergoing
pheA | . . . | 500 | 6/6 | N/A
tyrR | . . . | 500 | 3/16 | N/A
tyrR | . . . | 1000 | Undergoing | N/A
ptsH, ptsI and crr | . . . | 700 | 4/4 | Undergoing
melB | . . . | 1000 | 7/8 | Undergoing
rcsB | . . . | 1000 | 8/8 | Undergoing
aslA | . . . | 500 | Undergoing | N/A
aslA | . . . | 1000 | Undergoing | N/A
N/A: Not applicable

Example 3: Materials and Methods

UDS Fragment

UDS fragment oligos with one phosphorothioate bond modified after ‘G’ of the forward primer and ‘A’ of the reverse primer (FIG. 2a) were synthesized by Integrated DNA Technologies (IDT; sequences of oligos and modifications can be found in Table 3, and all oligos used in this study were prepared at 100 μM). Amplify template DNA by using the designed oligos in a PCR reaction (NEB Q5 Master Mix, M0494, is recommended, and the regular reaction volume is 50 microliters). Purify the obtained fragments by using a column according to the manufacturer's instructions, and add 40 microliters of nuclease-free water (1st BASE Biochemicals, BUF-1180) in the elution step. Add 5.5 microliters of 1 M Tris-HCl to the eluate obtained above, mix the components well, and incubate the mixture at 70° C. for 5 min in a 1.7 milliliter Eppendorf tube immersed in a water bath. After this chemical treatment, fragments containing 1-mer sticky ends are generated by cleavage specifically at the phosphorothioate positions²³. Add 250 microliters of nuclease-free water to the treated mixture. If the Thermo Scientific Gel Extraction Kit (K0692) is used, add 350 microliters of binding buffer, mix, load the solution onto the column, wash twice with 550 microliters of wash buffer, and elute with 30 microliters of nuclease-free water at room temperature.

TABLE 3 UDS fragments and oligos
UDS fragments | Forward oligos | Sequence (*: phosphorothioate bond) | Reverse oligos | Sequence (*: phosphorothioate bond)
gRNA-nupG | G-gRNA-nupG F | G*tacgagttaatcaatatcacagttttagagctagaaatag (SEQ ID NO: 1) | gRNA-T R | A*tctagagaattcaaaaaaag (SEQ ID NO: 2)
gRNA-aslA | G-gRNA-aslA F | G*tgcagaacttgagaaaaaaacgttttagagctagaaatag (SEQ ID NO: 3) | gRNA-T R | Same as gRNA-T R
gRNA-melB | G-gRNA-melB F | G*tctaccatttgttaattatgtgttttagagctagaaatag (SEQ ID NO: 4) | gRNA-T R | Same as gRNA-T R
gRNA-rcsB | G-gRNA-rcsB F | G*taatcacttgagcaaattgaggttttagagctagaaatag (SEQ ID NO: 5) | gRNA-T R | Same as gRNA-T R
gRNA-tyrR | G-gRNA-tyrR F | G*tttaataccgagcgttcaaaagttttagagctagaaatag (SEQ ID NO: 6) | gRNA-T R | Same as gRNA-T R
gRNA-pheA | G-gRNA-pheA F | G*ttttgagcaattcattgaaaggttttagagctagaaatag (SEQ ID NO: 7) | gRNA-T R | Same as gRNA-T R
gRNA-ptsI | G-gRNA-ptsI F | G*tgaagttgatttctttagtatgttttagagctagaaatag (SEQ ID NO: 8) | gRNA-T R | Same as gRNA-T R
nupG-HF0.5 | G-nupG-HF0.5 F | G*ttgatcctgccagcaata (SEQ ID NO: 9) | nupG-HF0.5-T R | A*catcgtgatgcggatgag (SEQ ID NO: 10)
nupG-HF1.0 | G-nupG-HF1.0 F | G*accatcgccgggacagaacc (SEQ ID NO: 11) | nupG-HF1.0-T R | Same as nupG-HF0.5-T R
nupG-HF1.5 | G-nupG-HF1.5 F | G*tgcaacgtgaagcagaaggt (SEQ ID NO: 12) | nupG-HF1.5-T R | Same as nupG-HF0.5-T R
aslA-HF0.5 | G-aslA-HF0.5 F | G*caccgtaaacggctctgc (SEQ ID NO: 13) | aslA-HF0.5-T R | A*gtttcatgtcatcaaaatg (SEQ ID NO: 14)
aslA-HF1.0 | G-aslA-HF1.0 F | G*ccagtacgacgatcgcct (SEQ ID NO: 15) | aslA-HF1.0-T R | Same as aslA-HF0.5-T R
melB-HF1.0 | G-melB-HF1.0 F | G*cccaatggcgatgaatacct (SEQ ID NO: 16) | melB-HF1.0-T R | A*gctgttaccaacgcccgcct (SEQ ID NO: 17)
rcsB-HF1.0 | G-rcsB-HF1.0 F | G*gttagcgaacatgcttgcgg (SEQ ID NO: 18) | rcsB-HF1.0-T R | A*ttgctacagcaagctcttga (SEQ ID NO: 19)
tyrR-HF0.5 | G-tyrR-HF0.5 F | G*cagcccgctggcgttggt (SEQ ID NO: 20) | tyrR-HF0.5-T R | A*gtcagcacccgatattgcat (SEQ ID NO: 21)
pheA-HF0.5 | G-pheA-HF0.5 F | G*catgtcgcagaccgtctcg (SEQ ID NO: 22) | pheA-HF0.5-T R | A*cgaaacgcctcccattcag (SEQ ID NO: 23)
ptsI-HF0.7 | G-ptsI-HF0.7 F | G*gcccgcataaaattcaggg (SEQ ID NO: 24) | ptsI-HF0.7-T R | A*ggaactaaagtctagcctgg (SEQ ID NO: 25)
nupG-HT0.5 | G-nupG-HT0.5 F | G*ttacgcaaagaaaaacgg (SEQ ID NO: 26) | nupG-HT0.5-T R | A*gccgctggttgaggtgtt (SEQ ID NO: 27)
nupG-HT1.0 | G-nupG-HT1.0 F | Same as G-nupG-HT0.5 F | nupG-HT1.0-T R | A*acgcctttatgctccatgct (SEQ ID NO: 28)
nupG-HT1.5 | G-nupG-HT1.5 F | Same as G-nupG-HT0.5 F | nupG-HT1.5-T R | A*gcgacgccggtctatctgga (SEQ ID NO: 29)
aslA-HT0.5 | G-aslA-HT0.5 F | G*gccggcgctatcgctgag (SEQ ID NO: 30) | aslA-HT0.5-T R | A*cactatgtttatccgcaa (SEQ ID NO: 31)
aslA-HT1.0 | G-aslA-HT1.0 F | Same as G-aslA-HT0.5 F | aslA-HT1.0-T R | A*gcccgcctgagatccaca (SEQ ID NO: 32)
melB-HT1.0 | G-melB-HT1.0 F | G*gtgcagtgagtgatgtgaaa (SEQ ID NO: 33) | melB-HT1.0-T R | A*gggtatggaagctatctgga (SEQ ID NO: 34)
rcsB-HT1.0 | G-rcsB-HT1.0 F | G*tcacctgtaggccagataag (SEQ ID NO: 35) | rcsB-HT1.0-T R | A*attcagaaccgggaatgggc (SEQ ID NO: 36)
tyrR-HT0.5 | G-tyrR-HT0.5 F | G*gcgcgaatatgcctgatg (SEQ ID NO: 37) | tyrR-HT0.5-T R | A*catcccgcaggcgggtag (SEQ ID NO: 38)
pheA-HT0.5 | G-pheA-HT0.5 F | G*ttactggcgattgtcattcg (SEQ ID NO: 39) | pheA-HT0.5-T R | A*aaatgggccattacaggcc (SEQ ID NO: 40)
ptsI-HT0.7 | G-ptsI-HT0.7 F | G*agcgcatcacttccagtac (SEQ ID NO: 41) | ptsI-HT0.7-T R | A*taacgataagagtagggcac (SEQ ID NO: 42)
aadA | G-spectR F | G*tcgacctgcagaagctt (SEQ ID NO: 43) | spectR-T R | A*cgttaagggattttggt (SEQ ID NO: 44)
bla | G-ampR F | G*tttctacaaactctttt (SEQ ID NO: 45) | G-ampR F | Same as spectR-T R
P5 | G-repA/p5 F | G*ccgttttcatctgtgcatat (SEQ ID NO: 46) | repA/p5-T R | A*tccttttgtaatactgcgga (SEQ ID NO: 47)
p15 | G-p15A F | G*tgttcagctactgacgg (SEQ ID NO: 48) | p15A-T R | A*gacatcaccgatgggga (SEQ ID NO: 49)
pMB1 | G-pMB1 F | G*agttttcgttccactga (SEQ ID NO: 50) | pMB1-T R | A*ggatccagcatatgcgg (SEQ ID NO: 51)
pMB1 mutant | G-pUC F | G*gctcactcaaaggcggta (SEQ ID NO: 52) | pUC-T R | A*attaccgcctttgagtga (SEQ ID NO: 53)
pLac | G-pLac F | G*caacgcaattaatgtgagt (SEQ ID NO: 54) | pLac-T R | A*ttgttatccgctcacaatt (SEQ ID NO: 55)
LacIT7 | G-lacIT7 F | G*gaaactacccataatacaag (SEQ ID NO: 56) | LacIT7-T R | A*gaggggaattgttatccgctcacaattcccctatagtga (SEQ ID NO: 57)
t7t | G-t7t F | G*ggctgctaacaaagccc (SEQ ID NO: 58) | t7t-T R | A*ggcaccgtcaccctggat (SEQ ID NO: 59)
LacI | G-lacI F | Same as G-lacIT7 F | lacI-T R | A*tcccggacaccatcgaat (SEQ ID NO: 60)
dxs | G-dxs F | G*agttttgatattgccaaata (SEQ ID NO: 61) | dxs-T R | A*tgccagccaggccttg (SEQ ID NO: 62)
idi | G-idi F | G*caaacggaacacgtcatttt (SEQ ID NO: 63) | idi-T R | A*tttaagctgggtaaatgcag (SEQ ID NO: 64)
ispA | G-ispA F | G*gactttccgcagcaact (SEQ ID NO: 65) | ispA-T R | A*tttattacgctggatgatgt (SEQ ID NO: 66)
ValC | G-ValC F | G*gccgagatgttcaacgg (SEQ ID NO: 67) | ValC-T R | A*ggggatgatgggctcg (SEQ ID NO: 68)
tyrR mutant | G-tyrR mutant F | G*gttgctgaattgaccgcatt (SEQ ID NO: 69) | tyrR mutant-T R | A*ctggcgattgtcattcgc (SEQ ID NO: 70)
aroG mutant | G-aroG mutant F | G*aattatcagaacgacgattt (SEQ ID NO: 71) | aroG mutant-T R | A*cccgcgacgcgctttta (SEQ ID NO: 72)

UDS Barcode

The sequences and annotations of the UDS barcodes are listed in Table 4. Three types of UDS barcodes were prepared by the following procedures:

Common cloning vectors (1.8 kb) containing a replication origin (pMB1) and antibiotic resistance (spectinomycin) were amplified from a template plasmid (pTarget), and barcodes were introduced during PCR by using oligos (one phosphorothioate bond was introduced after ‘T’ of the forward primer and ‘C’ of the reverse primer, as shown in FIG. 2b; details can be found in Table 5). For example, pT2-N21N22 can be amplified by using oligos T-N22′_pT2 F and pT2_N21″-G R. Then, 1 nt sticky ends of ‘A’ and ‘G’ outside the barcodes could be generated by chemical treatment. The five cloning plasmids with 1 nt SE-based barcodes obtained here are pT2-N21N22, pT2-N22pJ23119, pT2-pJ23119N23, pT2-N23N24 and pT2-N24N21, which can be used for barcoding five 1 nt SE-based fragments to construct the nupG knockout plasmid (FIG. 2f).

The two oligos to be annealed into each barcode were synthesized directly (sequences of oligos and modifications can be found in Table 5). The desired 1 nt SE of each barcode was generated by simply mixing 50 microliters of forward and reverse oligos (Table 4 and FIG. 2d) and annealing them with the following program in a PCR machine: 95° C. for 2 minutes, gradual decrease (0.1° C. per second) to 75° C., hold for 2 minutes, then gradual decrease (0.1° C. per second) to 4° C. For example, oligos N21s-Bff and N21s-Bfr are annealed to give prefix barcode N21-Bf, while suffix barcode N21-Br is obtained by annealing oligos N21s-Brf and N21s-Brr. In the same way, five prefix barcodes and five suffix barcodes were prepared for barcoding five 1 nt SE-based fragments to construct the nupG knockout plasmid (FIG. 2d).
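The way an annealed oligo pair yields a 1 nt sticky end can be checked computationally. The sketch below uses the N21 oligo sequences listed in Table 5; the helper names (`reverse_complement`, `strip_modifications`, `three_prime_overhang`) are hypothetical, and the check considers only Watson-Crick pairing of the full short oligo, not thermodynamics.

```python
# Complement table for DNA; case is preserved so the overhang base stays upper case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def strip_modifications(oligo):
    """Drop the /5Phos/ annotation used in Table 5, leaving only bases."""
    return oligo.replace("/5Phos/", "")


def three_prime_overhang(long_oligo, short_oligo):
    """Return the unpaired 3' tail of long_oligo once short_oligo is annealed.

    Returns None if the short oligo is not fully complementary to the
    5' portion of the long oligo.
    """
    long_seq = strip_modifications(long_oligo)
    short_seq = strip_modifications(short_oligo)
    paired = reverse_complement(short_seq)
    if not long_seq.lower().startswith(paired.lower()):
        return None
    return long_seq[len(paired):]


# Oligo pairs as listed in Table 5: (oligo carrying the overhang, its partner).
PREFIX_N21 = ("cctcgactcacacttgG", "/5Phos/caagtgtgagtcgagg")  # N21s-Bff, N21s-Bfr
SUFFIX_N21 = ("aagtgtgagtcgagggA", "/5Phos/ccctcgactcacactt")  # N21s-Brr, N21s-Brf

if __name__ == "__main__":
    print("prefix overhang:", three_prime_overhang(*PREFIX_N21))
    print("suffix overhang:", three_prime_overhang(*SUFFIX_N21))
```

For the N21 pairs, the prefix barcode carries a single unpaired 3′ ‘G’ and the suffix barcode a single unpaired 3′ ‘A’, consistent with the 1 nt SE described above.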

Stem-loop oligos used to create barcodes were synthesized directly (sequences of barcode oligos and modifications can be found in Table 5). The 6-nucleotide loop sequences were generated randomly and subsequently filtered by an algorithm to remove undesired interactions with the stem region, which covers the SE and the adjacent part of the barcode. Phosphorylation of 1 microliter of barcode oligo was done by using T4 Polynucleotide Kinase (NEB, M0201) (Table 6); this step can be omitted by directly using phosphorylated oligos synthesized by the oligo manufacturer. The barcode oligos were folded into the stem-loop structure with the same PCR program used above to create the annealed-oligo-based barcodes. For example, prefix barcode RBS1-Bf-SL is generated by simply annealing the barcode oligo RBS1-Bf-SL (Table 4 and FIG. 2e).
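The stem-loop architecture of these oligos can be verified from the sequence alone, since Table 5 marks the loop in upper case. The sketch below splits an oligo into 5′ stem, loop, 3′ stem and terminal overhang base and checks that the two stem arms are reverse complements. The two N21 sequences are re-joined from the Table 5 entries for SEQ ID NO: 125 and 126; the function names are hypothetical, and the loop-filtering algorithm mentioned in the text is not reproduced here because its criteria are not given.

```python
import re

_COMPLEMENT = str.maketrans("acgt", "tgca")
# lower-case stem, upper-case loop, lower-case stem, one upper-case overhang base
_STEM_LOOP = re.compile(r"([acgt]+)([ACGT]+)([acgt]+)([ACGT])")


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_stem_loop(oligo):
    """Split a Table 5 style stem-loop oligo into its four parts."""
    match = _STEM_LOOP.fullmatch(oligo)
    if match is None:
        raise ValueError("oligo does not follow the stem/LOOP/stem/OVERHANG layout")
    stem5, loop, stem3, overhang = match.groups()
    return {"stem5": stem5, "loop": loop, "stem3": stem3, "overhang": overhang}


def folds_as_hairpin(oligo):
    """True if the 3' stem arm is the reverse complement of the 5' stem arm."""
    parts = parse_stem_loop(oligo)
    return reverse_complement(parts["stem5"]) == parts["stem3"]


N21_BF_SL = "ccaagtgtgagtcgaggAAGGGCcctcgactcacacttggG"  # SEQ ID NO: 125
N21_BR_SL = "tccctcgactcacacttGCGAGAaagtgtgagtcgagggaA"  # SEQ ID NO: 126

if __name__ == "__main__":
    for name, oligo in (("N21-Bf-SL", N21_BF_SL), ("N21-Br-SL", N21_BR_SL)):
        parts = parse_stem_loop(oligo)
        print(name, "loop:", parts["loop"], "overhang:", parts["overhang"],
              "hairpin:", folds_as_hairpin(oligo))
```

Both N21 oligos parse into a 17 nt stem, a 6 nt loop and a single 3′ overhang base (‘G’ for the prefix, ‘A’ for the suffix), which matches the 6-nucleotide loop and 1 nt SE described in the text.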

TABLE 4 UDS Barcodes
Barcodes | Sequence | Functionality
RBS1 | TagaaataattttgtttaactttaagaaggagatatacatatG (SEQ ID NO: 73) | Ribosome binding site
RBS2stop | TaaccgttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 74) | Ribosome binding site with stop codon
RBS3stop | TgattcacacaggaaacagctatG (SEQ ID NO: 75) | Ribosome binding site with stop codon
RBS4stop | TaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 76) | Ribosome binding site with stop codon
pJ23119 | TtgacagctagctcagtcctaggtataatactaG (SEQ ID NO: 77) | Promoter
N21 | TtccctcgactcacacttggG (SEQ ID NO: 78) | Non-functional
N22 | TtcacacaacatagccacggG (SEQ ID NO: 79) | Non-functional
N23 | TtcaaacagaaaggccatggG (SEQ ID NO: 80) | Non-functional
N24 | TtcctcatggattctacgggG (SEQ ID NO: 81) | Non-functional
N31 | TgatgggctgaagggtttaaG (SEQ ID NO: 82) | Non-functional with stop codon
N32 | TgtcccatcacagcttacaaG (SEQ ID NO: 83) | Non-functional
LK3 | TcaggctcgtcttcttcaggG (SEQ ID NO: 84) | Linker used to create fusion protein

TABLE 5 Barcode oligos
Forward oligos (cloning vectors) | Sequence (*: phosphorothioate bond) | Reverse oligos (cloning vectors) | Sequence (*: phosphorothioate bond)
T-N21′_pT2 F | T*tccctcgactcacactttcgagttcatgtgcagctcc (SEQ ID NO: 85) | pT2_N21″-G R | C*ccaagtgtgagtcgaggtcagctcactcaaaggcggt (SEQ ID NO: 86)
T-N22′_pT2 F | T*tcacacaacatagccactcgagttcatgtgcagctcc (SEQ ID NO: 87) | pT2_N22″-G R | C*ccgtggctatgttgtgttcagctcactcaaaggcggt (SEQ ID NO: 88)
T-N23′_pT2 F | T*tcaaacagaaaggccattcgagttcatgtgcagctcc (SEQ ID NO: 89) | pT2_N23″-G R | C*ccatggcctttctgtttgtcagctcactcaaaggcggt (SEQ ID NO: 90)
T-N24′_pT2 F | T*tcctcatggattctacgtcgagttcatgtgcagctcc (SEQ ID NO: 91) | pT2_N24″-G R | C*cccgtagaatccatgagtcagctcactcaaaggcggt (SEQ ID NO: 92)
T-pJ23119′_pT2 F | T*tgacagctagctcagtcctaggtcgagttcatgtgcagctcc (SEQ ID NO: 93) | pT2_pJ23119″-G R | C*tagtattatacctaggactgagctatcagctcactcaaaggcggt (SEQ ID NO: 94)
Forward oligos (annealed oligos) | Sequence (/5Phos/: 5′-end phosphorylation) | Reverse oligos (annealed oligos) | Sequence (/5Phos/: 5′-end phosphorylation)
N21s-Bff | cctcgactcacacttgG (SEQ ID NO: 95) | N21s-Bfr | /5Phos/caagtgtgagtcgagg (SEQ ID NO: 96)
N22s-Bff | acacaacatagccacgG (SEQ ID NO: 97) | N22s-Bfr | /5Phos/cgtggctatgttgtgt (SEQ ID NO: 98)
N23s-Bff | caaacagaaaggccatgG (SEQ ID NO: 99) | N23s-Bfr | /5Phos/catggcctttctgtttg (SEQ ID NO: 100)
N24s-Bff | ctcatggattctacggG (SEQ ID NO: 101) | N24s-Bfr | /5Phos/ccgtagaatccatgag (SEQ ID NO: 102)
pJ23119s-Bff | tagctcagtcctaggtataatactaG (SEQ ID NO: 103) | pJ23119s-Bfr | /5Phos/tagtattatacctaggactgagcta (SEQ ID NO: 104)
N21s-Brf | /5Phos/ccctcgactcacactt (SEQ ID NO: 105) | N21s-Brr | aagtgtgagtcgagggA (SEQ ID NO: 106)
N22s-Brf | /5Phos/cacacaacatagccac (SEQ ID NO: 107) | N22s-Brr | gtggctatgttgtgtgA (SEQ ID NO: 108)
N23s-Brf | /5Phos/caaacagaaaggccat (SEQ ID NO: 109) | N23s-Brr | atggcctttctgtttgA (SEQ ID NO: 110)
N24s-Brf | /5Phos/cctcatggattctacg (SEQ ID NO: 111) | N24s-Brr | cgtagaatccatgaggA (SEQ ID NO: 112)
pJ23119s-Brf | /5Phos/tgacagctagctcagtcctagg (SEQ ID NO: 113) | pJ23119s-Brr | cctaggactgagctagctgtcaA (SEQ ID NO: 114)
Prefix barcode oligos (stem-loop oligos) | Sequence (upper case indicates loop region) | Suffix barcode oligos (stem-loop oligos) | Sequence (upper case indicates loop region)
RBS1-Bf-SL | atatgtatatctccttcttaaagttaaacaaTATGTTttgtttaactttaagaaggagatatacatatG (SEQ ID NO: 115) | RBS1-Br-SL | agaaataattttgtttaactttaagaaggTCTACTccttcttaaagttaaacaaaattatttctA (SEQ ID NO: 116)
RBS2stop-Bf-SL | atcgaacaatccttttgtgataaatgaaTCGGTTttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 117) | RBS2stop-Br-SL | aaccgttcatttatcacaaaaggaTATCACtccttttgtgataaatgaacggttA (SEQ ID NO: 118)
RBS3stop-Bf-SL | gattcacacaggaaacagctTCTTCGagctgtttcctgtgtgaatcA (SEQ ID NO: 119) | RBS3stop-Br-SL | atagctgtttcctgtgtGCCTGGacacaggaaacagctatG (SEQ ID NO: 120)
RBS4stop-Bf-SL | aaattaattgttcttttttcaggtgaaggttcccTCACATgggaaccttcacctgaaaaaagaacaattaatttA (SEQ ID NO: 121) | RBS4stop-Br-SL | atgggaaccttcacctgAAGTTAcaggtgaaggttcccatG (SEQ ID NO: 122)
pJ23119-Bf-SL | tagtattatacctaggactgagctaAGAGGGtagctcagtcctaggtataatactaG (SEQ ID NO: 123) | pJ23119-Br-SL | tgacagctagctcagtcctaggGCACAGcctaggactgagctagctgtcaA (SEQ ID NO: 124)
N21-Bf-SL | ccaagtgtgagtcgaggAAGGGCcctcgactcacacttggG (SEQ ID NO: 125) | N21-Br-SL | tccctcgactcacacttGCGAGAaagtgtgagtcgagggaA (SEQ ID NO: 126)
N22-Bf-SL | ccgtggctatgttgtgtCGTATTacacaacatagccacggG (SEQ ID NO: 127) | N22-Br-SL | tcacacaacatagccacTTGCTGgtggctatgttgtgtgaA (SEQ ID NO: 128)
N23-Bf-SL | ccatggcctttctgtttgACCATAcaaacagaaaggccatggG (SEQ ID NO: 129) | N23-Br-SL | tcaaacagaaaggccatCATTCAatggcctttctgtttgaA (SEQ ID NO: 130)
N24-Bf-SL | cccgtagaatccatgagAACCCGctcatggattctacgggG (SEQ ID NO: 131) | N24-Br-SL | tcctcatggattctacgACTATGcgtagaatccatgaggaA (SEQ ID NO: 132)
N31-Bf-SL | ttaaacccttcagcccaCACACAtgggctgaagggtttaaG (SEQ ID NO: 133) | N31-Br-SL | gatgggctgaagggtttCACACAaaacccttcagcccatcA (SEQ ID NO: 134)
N32-Bf-SL | ttgtaagctgtgatgggGCCTGGcccatcacagcttacaaG (SEQ ID NO: 135) | N32-Br-SL | gtcccatcacagcttacGCTTCGgtaagctgtgatgggacA (SEQ ID NO: 136)
LK3-Bf-SL | cctgaagaagacgagccCACACAggctcgtcttcttcaggG (SEQ ID NO: 137) | LK3-Br-SL | caggctcgtcttcttcaCACACAtgaagaagacgagcctgA (SEQ ID NO: 138)

TABLE 6 Phosphorylation of stem-loop barcode oligo
Barcoding oligos (100 μM stock solution) | T4 ligase buffer (10X) | T4 Polynucleotide Kinase (NEB: M0201) | Nuclease-free water
1 microliter | 2 microliters | 0.5 microliter | 16.5 microliters
Incubate the mixture in a PCR tube at 37° C. for 30 minutes, and then at 65° C. for 20 minutes to deactivate the kinase.
Stem-loop folding program: incubate at 98° C. for 2 minutes, decrease to 75° C. (0.1° C. per second), hold for 2 minutes, then gradually decrease to 4° C. (0.1° C. per second).
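Two quantities implied by Table 6 can be worked out directly: the oligo concentration after the phosphorylation mix is assembled, and the programmed duration of the folding program. The sketch below is illustrative only; the function names are hypothetical, and the duration ignores the time the instrument needs to reach the starting temperature.

```python
def final_concentration_um(stock_um, oligo_ul, *other_ul):
    """Oligo concentration (uM) after mixing oligo_ul of stock with the other volumes."""
    total_ul = oligo_ul + sum(other_ul)
    return stock_um * oligo_ul / total_ul


def ramp_seconds(start_c, end_c, rate_c_per_s=0.1):
    """Time needed to move between two temperatures at a fixed ramp rate."""
    return abs(start_c - end_c) / rate_c_per_s


def program_seconds(start_c, hold_c=75, end_c=4, hold_s=120, rate_c_per_s=0.1):
    """Initial hold, ramp to hold_c, second hold, then ramp to end_c."""
    return (hold_s
            + ramp_seconds(start_c, hold_c, rate_c_per_s)
            + hold_s
            + ramp_seconds(hold_c, end_c, rate_c_per_s))


if __name__ == "__main__":
    # Table 6: 1 uL oligo (100 uM) + 2 uL buffer + 0.5 uL kinase + 16.5 uL water
    print("oligo after mixing (uM):", final_concentration_um(100, 1, 2, 0.5, 16.5))
    print("folding program, 98 C start (min):", program_seconds(98) / 60)
    print("annealing program, 95 C start (min):", program_seconds(95) / 60)
```

The 20 microliter mix dilutes the 100 μM stock to 5 μM, and the folding program in Table 6 amounts to roughly 20 minutes of programmed time (1180 seconds).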

Barcoding UDS Fragment

Ligation of the three types of barcodes (cloning vector, annealed oligo and stem-loop oligo) with 1 nt SE-based UDS fragments was done at 25° C. for 5 to 10 minutes by using NEB Blunt/TA Ligase Master Mix (M0367) according to the manufacturer's instructions. The molar ratio of barcodes to UDS fragments should be between 10:1 and 3:1 to reach maximum ligation efficiency (Table 7), but for cloning-vector-based barcoding, the molar ratio of insert to cloning vector should be 3:1. Amplify the barcoded UDS fragments in a 50 microliter PCR reaction by using 1 microliter of the ligation product as template and the UDS universal oligos that correspond to the barcodes (FIG. 2c, indicated by black arrow; sequences of oligos and modifications can be found in Table 8). The barcoded UDS fragments were then digested chemically to create 10-20 bp SE (FIG. 6), and purified for downstream assembly. Barcoded UDS fragments used in the three applications are listed in Table 9 and their sequences are shown in Table 10.

TABLE 7 Barcoding ligation
Prefix barcode (5 μM) | Suffix barcode (5 μM) | 1-mer based fragments | Blunt/TA Ligase Master Mix (2X) (NEB: M0367)
0.3 microliters | 0.3 microliters | 3 microliters | 3.6 microliters
Incubate the mixture in a PCR tube at 25° C. for 5 to 10 minutes. This step can also be done at room temperature on the bench.
Note: the volume of fragments used for barcoding depends on their concentration. Usually, 10-200 ng (3 μL) of fragments with a size of less than 3 kb should be adequate, and the molar ratio of barcodes to fragments is recommended to be 10:1 to 3:1, as suggested by the NEB protocol. For barcoding fragments with a size over 3 kb, it is better to calculate the molar ratio of barcodes to fragment and use a higher concentration of fragments (over 100 ng/mL; the ligation volume can also be increased accordingly). The yield of the ligation PCR can be improved to a certain degree by adding more ligation product (diluted 5-fold or undiluted).
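The barcode-to-fragment molar ratio recommended in Table 7 can be estimated from the fragment mass and length. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the text; the function names are hypothetical.

```python
# Assumption: average molar mass of one base pair of double-stranded DNA.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def barcode_pmol(volume_ul, conc_um):
    """Picomoles of barcode in a given volume (uL x uM = pmol)."""
    return volume_ul * conc_um


def barcode_to_fragment_ratio(barcode_ul, barcode_um, fragment_ng, fragment_bp):
    """Molar ratio of one barcode to the fragment in the ligation."""
    return barcode_pmol(barcode_ul, barcode_um) / dsdna_pmol(fragment_ng, fragment_bp)


def in_recommended_window(ratio, low=3.0, high=10.0):
    """True if the ratio lies within the 10:1 to 3:1 window from Table 7."""
    return low <= ratio <= high


if __name__ == "__main__":
    # 0.3 uL of 5 uM barcode with 100 ng of a 1 kb fragment
    ratio = barcode_to_fragment_ratio(0.3, 5, 100, 1000)
    print(f"ratio = {ratio:.2f}:1, within window: {in_recommended_window(ratio)}")
```

With the volumes in Table 7, 100 ng of a 1 kb fragment gives a ratio of about 9.75:1, which falls inside the recommended window. Shorter or more concentrated fragments lower the ratio and may require more barcode.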

TABLE 8 UDS universal oligos
UDS forward universal oligos | Sequence (*: phosphorothioate bond)
RBS1-Bff | ttgtttaac*tttaagaagg*agatatacatatG (SEQ ID NO: 139)
RBS2stop-Bff | ttcatttat*cacaaaagga*ttgttcgatG (SEQ ID NO: 140)
RBS3-Bff | acacagga*aacagct*atG (SEQ ID NO: 141)
RBS4-Bff | caggtgaa*ggttccc*atG (SEQ ID NO: 142)
pJ23119-Bff | tagctca*gtcctagg*tataatactaG (SEQ ID NO: 143)
N21-Bff | cctcgac*tcacactt*ggG (SEQ ID NO: 144)
N22-Bff | acacaac*atagccac*ggG (SEQ ID NO: 145)
N23-Bff | caaacaga*aaggccat*ggG (SEQ ID NO: 146)
N24-Bff | ctcatgg*attctacg*ggG (SEQ ID NO: 147)
N31-Bff | tgggctg*aagggttt*aaG (SEQ ID NO: 148)
N32-Bff | cccatcac*agcttac*aaG (SEQ ID NO: 149)
LK3-Bff | ggctcgt*cttcttca*ggG (SEQ ID NO: 150)
UDS reverse universal oligos | Sequence (*: phosphorothioate bond)
RBS1-Brr | ccttcttaa*agttaaacaa*aattatttctA (SEQ ID NO: 151)
RBS2stop-Brr | tccttttgt*gataaatgaa*cggttA (SEQ ID NO: 152)
RBS3stop-Brr | agctgttt*cctgtgt*gaatcA (SEQ ID NO: 153)
RBS4stop-Brr | gggaacct*tcacctg*aaaaaagaacaattaatttA (SEQ ID NO: 154)
pJ23119-Brr | cctagga*ctgagcta*gctgtcaA (SEQ ID NO: 155)
N21-Brr | aagtgtg*agtcgagg*gaA (SEQ ID NO: 156)
N22-Brr | gtggcta*tgttgtgt*gaA (SEQ ID NO: 157)
N23-Brr | atggcctt*tctgtttg*aA (SEQ ID NO: 158)
N24-Brr | cgtagaa*tccatgag*gaA (SEQ ID NO: 159)
N31-Brr | aaaccct*tcagccca*tcA (SEQ ID NO: 160)
N32-Brr | gtaagctg*tgatggg*acA (SEQ ID NO: 161)
LK3-Brr | tgaagaa*gacgagcc*tgA (SEQ ID NO: 162)
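The positions of the phosphorothioate bonds in a universal oligo can be read off the ‘*’ notation of Table 8. The sketch below reports, for each marked bond, how many nucleotides precede it from the 5′ end. The helper names are hypothetical, and the text does not state how these positions map onto the 10-20 bp SE obtained after chemical digestion, so the output should be read only as a description of the oligo layout.

```python
def phosphorothioate_positions(oligo):
    """Number of nucleotides 5' of each '*' marker, in 5'-to-3' order."""
    positions = []
    n_bases = 0
    for char in oligo:
        if char == "*":
            positions.append(n_bases)
        else:
            n_bases += 1
    return positions


def bare_sequence(oligo):
    """The oligo sequence with the '*' markers removed."""
    return oligo.replace("*", "")


if __name__ == "__main__":
    for name, oligo in (("N21-Bff", "cctcgac*tcacactt*ggG"),
                        ("RBS1-Bff", "ttgtttaac*tttaagaagg*agatatacatatG")):
        print(name, len(bare_sequence(oligo)), "nt, bonds after nt",
              phosphorothioate_positions(oligo))
```

For N21-Bff the two marked bonds follow nucleotides 7 and 15 of an 18 nt oligo, and for RBS1-Bff they follow nucleotides 9 and 19, so the 3′-most marked bond lies within the 10-20 nt range quoted for the sticky ends.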

TABLE 9 Barcoded UDS fragments
Barcoded UDS fragments | Barcoded UDS fragments
(pJ23119)gRNA-nupG(N23) | (N31)t7t(LK3)
(pJ23119)gRNA-aslA(N23) | (LK3)LacI(N21)
(pJ23119)gRNA-melB(N23) | (RBS1)dxs(RBS2stop)
(pJ23119)gRNA-rcsB(N23) | (RBS2stop)idi(RBS3stop)
(pJ23119)gRNA-tyrR(N23) | (RBS3stop)valC(RBS4stop)
(pJ23119)gRNA-pheA(N23) | (RBS4stop)ispA(N31)
(pJ23119)gRNA-ptsI(N23) | (RBS1)idi(RBS2stop)
(N23)nupG-HF0.5(N24) | (RBS2stop)dxs(RBS3stop)
(N23)nupG-HF1.0(N24) | (RBS3stop)dxs(RBS4stop)
(N23)nupG-HF1.5(N24) | (RBS4stop)dxs(N31)
(N23)aslA-HF0.5(N24) | (RBS1)valC(RBS2stop)
(N23)aslA-HF1.0(N24) | (RBS2stop)valC(RBS3stop)
(N23)melB-HF1.0(N24) | (RBS3stop)idi(RBS4stop)
(N23)rcsB-HF1.0(N24) | (RBS4stop)idi(N31)
(N23)tyrR-HF0.5(N24) | (RBS1)ispA(RBS2stop)
(N23)pheA-HF0.5(N24) | (RBS2stop)ispA(RBS3stop)
(N23)ptsI-HF0.7(N24) | (RBS3stop)ispA(RBS4stop)
(N24)nupG-HT0.5(N21) | (RBS4stop)valC(N31)
(N24)nupG-HT1.0(N21) | (RBS1)aroG mutant(RBS4stop)
(N24)nupG-HT1.5(N21) | (RBS1)tyrA mutant(RBS4stop)
(N24)aslA-HT0.5(N21) | (RBS4stop)tyrA mutant(N31)
(N24)aslA-HT1.0(N21) | (RBS4stop)aroG mutant(N31)
(N24)melB-HT1.0(N21) |
(N24)rcsB-HT1.0(N21) |
(N24)tyrR-HT0.5(N21) |
(N24)pheA-HT0.5(N21) |
(N24)ptsI-HT0.7(N21) |
(N21)aadA(N22) |
(N21)bla(N22) |
(N22)p5(N23) |
(N22)p15(N23) |
(N22)pMB1(N23) |
(N22)pMB1 mutant(N23) |
(N23)pLac(RBS1) |
(N23)LacIT7(RBS1) |
(N31)t7t(N21) |

TABLE 10 Sequences of barcoded fragments Barcoded UDS fragments Sequence(pJ23119)gRNA-nupG(N23)ttgacagctagctcagtcctaggtataatactagtACGAGTTAATCAATATCACAgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 163) (pJ23119)gRNA-aslA(N23)ttgacagctagctcagtcctaggtataatactagtTCCAGATCTTCCATATATTTgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 164) (pJ23119)gRNA-melB(N23)ttgacagctagctcagtcctaggtataatactagtCTACCATTTGTTAATTATGTgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 165) (pJ23119)gRNA-rcsB(N23)ttgacagctagctcagtcctaggtataatactagtAATCACTTGAGCAAATTGAGgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 166) (pJ23119)gRNA-tyrR(N23)ttgacagctagctcagtcctaggtataatactagtTTAATACCGAGCGTTCAAAAgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 167) (pJ23119)gRNA-pheA(N23)ttgacagctagctcagtcctaggtataatactagtTTTGAGCAATTCATTGAAAGgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 168) (pJ23119)gRNA-ptsI(N23)ttgacagctagctcagtcctaggtataatactagtGAAGTTGATTTCTTTAGTATgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttgaattctctagaTtcaaacagaaaggccatggG (SEQ ID NO: 169) 
(N23)nupG-HF0.5(N24)TtcaaacagaaaggccatggGttgatcctgccagcaatattgataccggcaccgcgtatctggcgatgctgaacaatgtttatctcggcggaattgataacccaacatcgcggcgttatgccgtcatcaccgcctataacggcggcgcaggcagcgtgctgcgagtcttttcgaatgataagattcaggctgccaatattattaacaccatgacgccgggcgatgtttatcagacgctgacgacccgccatccctctgcggaatctcgccgttatctttataaagtgaataccgcgcaaaaatcctaccgccgccgataattccattaaccgcccctgacgatgctcaggggcaaaaatgttatccacatcacaatttcgttttgcaaattgggaatgtttgcaattatttgccacaggtaacaaaaaaccagtccgcgaagttgatagaatcccatcatctcgcacggtcaaatgtgctttttcaaacactcatccgcatcacgatgTtcctcatggattctacgggG (SEQ ID NO: 170) (N23)nupG-HF1.0(N24)TtcaaacagaaaggccatggGaccatcgccgggacagaacctgccgcgcatttgcgccgggcaattatcaaaacgttattgatgggtgacgatccgagttcggtcgatctctattccgacgttgatgatattacgatttcgaaagaacctttcctttacggtcaggtggtggacaacaccgggcagccgattcgctgggaaggtcgcgcaagcaacttcgcggattatctgctgaaaaaccgtctgaagagccgcagcaacgggctgcgtatcatctacagcgtcaccattaacatggtgccgaaccaccttgataaacgtgcgcacaaatatctcggcatggtccgccaggcgtcacggaaatatggcgttgatgagtcgctgattctggcaattatgcagaccgaatcttcctttaacccgtatgcggtcagccgttccgatgcgctgggattaatgcaggtggtacaacatactgccgggaaagatgtgttccgctcgcaggggaaatccggcacgccgagccgcagtttcttgtttgatcctgccagcaatattgataccggcaccgcgtatctggcgatgctgaacaatgtttatctcggcggaattgataacccaacatcgcggcgttatgccgtcatcaccgcctataacggcggcgcaggcagcgtgctgcgagtcttttcgaatgataagattcaggctgccaatattattaacaccatgacgccgggcgatgtttatcagacgctgacgacccgccatccctctgcggaatctcgccgttatctttataaagtgaataccgcgcaaaaatcctaccgccgccgataattccattaaccgcccctgacgatgctcaggggcaaaaatgttatccacatcacaatttcgttttgcaaattgggaatgtttgcaattatttgccacaggtaacaaaaaaccagtccgcgaagttgatagaatcccatcatctcgcacggtcaaatgtgctttttcaaacactcatccgcatcacgatgTtcctcatggattctacgggG (SEQ ID NO: 171) 
(N23)nupG-HF1.5(N24)TtcaaacagaaaggccatggGtgcaacgtgaagcagaaggtcaggattttcagctgtaccccggcgagctgggaaaacgcatctataacgagatctccaaagaagcctgggcgcagtggcagcacaagcaaaccatgctgattaatgaaaagaaactcaacatgatgaatgccgagcaccgcaagctgcttgagcaggagatggtcaacttcctgttcgagggtaaagaggtgcatatcgagggctatacgccggaagataaaaaataaaaacagtgccggagcacgcctccggcaacttgcataaaaacaaacacaacacgcacccggaatgatgaaaaaatatctcgcgctggctttgattgcgccgttgctcatctcctgttcgacgaccaaaaaaggcgatacctataacgaagcctgggtcaaagataccaacggttttgatattctgatggggcaatttgcccacaatattgagaacatctggggcttcaaagaggtggtgatcgctggtcctaaggactacgtgaaatacaccgatcaatatcagacccgcagccacatcaacttcgatgacggtacgattactatcgaaaccatcgccgggacagaacctgccgcgcatttgcgccgggcaattatcaaaacgttattgatgggtgacgatccgagttcggtcgatctctattccgacgttgatgatattacgatttcgaaagaacctttcctttacggtcaggtggtggacaacaccgggcagccgattcgctgggaaggtcgcgcaagcaacttcgcggattatctgctgaaaaaccgtctgaagagccgcagcaacgggctgcgtatcatctacagcgtcaccattaacatggtgccgaaccaccttgataaacgtgcgcacaaatatctcggcatggtccgccaggcgtcacggaaatatggcgttgatgagtcgctgattctggcaattatgcagaccgaatcttcctttaacccgtatgcggtcagccgttccgatgcgctgggattaatgcaggtggtacaacatactgccgggaaagatgtgttccgctcgcaggggaaatccggcacgccgagccgcagtttcttgtttgatcctgccagcaatattgataccggcaccgcgtatctggcgatgctgaacaatgtttatctcggcggaattgataacccaacatcgcggcgttatgccgtcatcaccgcctataacggcggcgcaggcagcgtgctgcgagtcttttcgaatgataagattcaggctgccaatattattaacaccatgacgccgggcgatgtttatcagacgctgacgacccgccatccctctgcggaatctcgccgttatctttataaagtgaataccgcgcaaaaatcctaccgccgccgataattccattaaccgcccctgacgatgctcaggggcaaaaatgttatccacatcacaatttcgttttgcaaattgggaatgtttgcaattatttgccacaggtaacaaaaaaccagtccgcgaagttgatagaatcccatcatctcgcacggtcaaatgtgctttttcaaacactcatccgcatcacgatgTtcctcatggattctacgggG (SEQ ID NO: 172) 
(N23)aslA-HF0.5(N24)TtcaaacagaaaggccatggGcaccgtaaacggctctgcgtcattccggagtttatgaggcactaaggcgaacataagagatggaatgagcatctactcgtttattatgccacagagaatcgggaaataacatcccttaacacttgttatgagataattctgtaatcctctttgcttcctgagtaataacttcctgagtgaatatttaacctgagcttgatcctacacatacttattatgaatgataaaattcattcaattaataacacatatattaattgccgttaaaactaaaaacagcatcaataatcaacgcgatataataaacctgccttacatatcaactgcgccagaggtaggattgaaaacgctctcctgattttccaattcattttctggataaataaataatttatttttgtcactattatttatgtaatcatcctgtcagggagagggatctcaattatcaatgcttaattacgtcatcattttgatgacatgaaacTtcctcatggattctacgggG (SEQ ID NO: 173)(N23)aslA-HF1.0(N24)TtcaaacagaaaggccatggGccagtacgacgatcgcctactgctgccgattcctcgactgaaaacaaacaatccggaacagctggaaaaagtgctgcgccagcaaatcaaaaacgtcggcgatcgcccgctgttgtggagcacactgggccagtcactgatgaagcacggagaatggcaggaagcatcgctcgccttccgcgcagcgctgaaacaacgtccggacgcctacgattacgcatggcttgccgacgcgctggacagactgcacaagccggaagaagctgcagctatgcgtcgcgacggtttgatgttaacgttgcagaataacccgccacagtagttccttctcacccggaggcaagcacctccggggccttcctgatacataaaaaaacgcctgctcttattacggagcaggcgttaaaacaggtctgtatgacaacaagtgggtgcttcactcaacgttgtgtccatggtgtctgatgaggcataagcgacatctgtcagtggacgataagcaccgtaaacggctctgcgtcattccggagtttatgaggcactaaggcgaacataagagatggaatgagcatctactcgtttattatgccacagagaatcgggaaataacatcccttaacacttgttatgagataattctgtaatcctctttgcttcctgagtaataacttcctgagtgaatatttaacctgagcttgatcctacacatacttattatgaatgataaaattcattcaattaataacacatatattaattgccgttaaaactaaaaacagcatcaataatcaacgcgatataataaacctgccttacatatcaactgcgccagaggtaggattgaaaacgctctcctgattttccaattcattttctggataaataaataatttatttttgtcactattatttatgtaatcatcctgtcagggagagggatctcaattatcaatgcttaattacgtcatcattttgatgacatgaaacTtcctcatggattctacgggG (SEQ ID NO: 
174)(N23)melB-HF1.0(N24)TtcaaacagaaaggccatggGcccaatggcgatgaatacctgggcgatgtatgcccgctatccgcatatcaaacaggtcgggctgtgccattcggtgcagggaacggcggaagagttggcgcgtgacctcaatatcgacccagctacgctgcgttaccgttgcgcaggtatcaaccatatggcgttttacctggagctggagcgcaaaaccgccgacggcagttatgtgaatctctacccggaactgctggcggcttatgaagcagggcaggcaccgaagccgaatattcatggcaatactcgctgccagaatattgtgcgctacgaaatgttcaaaaagctgggctatttcgtcacggaatcgtcagaacattttgctgagtacacaccgtggtttattaagccaggtcgtgaggatttgattgagcgttataaagtaccgctggatgagtacccgaaacgctgcgtcgagcagctggcgaactggcataaagagctggaggagtataaaaaagcctcccggattgatattaaaccgtcacgggaatatgccagcacaatcatgaacgctatctggactggcgagccgagtgtgatttacggcaacgtccgtaacgatggtttgattgataacctgccacaaggatgttgcgtggaagtagcctgtctggttgatgctaatggcattcagccgaccaaagtcggtacgctaccttcgcatctggccgccctgatgcaaaccaacatcaacgtacagacgctgctgaccgaagctattcttacggaaaatcgcgaccgtgtttaccacgccgcgatgatggacccgcatactgccgccgtgctgggcattgacgaaatatatgctcttgttgacgacctgattgccgcccacggcgactggctgccaggctggttgcaccgttaaaacgcgactaaacgctactgcgccgggggatttattccggcgcacacctctgacgataccaataacagaaggcgggcgttggtaacagcTtcctcatggattctacgggG (SEQ ID NO: 175) 
(N23)rcsB-HF1.0(N24)TtcaaacagaaaggccatggGgttagcgaacatgcttgcggatgatagctggaaaagtgagacggtgctgttctccgtgcaggatttaattgatgaagttgtgccttcagtgttgcctgccatcaagcgtaaaggtctgcaactgctgattaacaatcatctgaaagcacacgatatgcgccgcggcgatcgcgatgccttacgacgtattttgctgctactgatgcaatatgccgtgacctcaacgcaattgggaaaaatcacccttgaggttgatcaggatgagtcctccgaagaccgcctgacgttccgcattctggacacgggagaaggcgtaagtattcatgaaatggataatttgcacttcccgtttatcaaccagacccaaaacgatcgctatggcaaggcggacccgctggcattctggctgagcgatcaactggcacgtaaactgggcggtcatttaaacatcaaaacgcgggatgggcttggtacacgctactctgtgcatatcaaaatgctcgcagctgacccggaagttgaagaggaagaagagcgtttactggatgatgtctgcgtaatggtggatgttacttcggcagaaattcggaatattgtcactcgccagttagaaaattggggtgcaacctgtatcacacccgatgaaagattaattagtcaagattatgatatctttttaacggataatccgtctaatcttactgcctctggcttgcttttaagcgatgatgagtctggcgtacgggaaattgggcctggtcaattgtgcgtcaacttcaatatgagcaacgctatgcaggaagcggtcttacaattaattgaagtgcaactggcgcaggaagaggtgacagaatcgcctctgggcggagatgaaaatgcgcaactccatgccagcggctattatgcgctctttgtagacacagtaccggatgatgttaagaggctgtatactgaagcagcaaccagtgactttgctgcgttagcccaaacggctcatcgtcttaaaggcgtatttgccatgctaaatctggtacccggcaagcagttatgtgaaacgctggaacatctgattcgtgagaaggatgttccaggaatagaaaaatacatcagcgacattgacagttatgtcaagagcttgctgtagcaaTtcctcatggattctacgggG (SEQ ID NO: 176)(N23)tyrR-HF0.5(N24)CagcccgctggcgttggtcgatatggcgtttatcgcctggcgcaatctgcgtttaattaatcgcatcgccacgctgtatggcattgaactggggtattacagccgtttgcgtctgtttaagctggtattgctgaatatcgcttttgccggagccagcgaactggtgcgcgaagtggggatggactggatgtcgcaagatctcgctgctcgtttgtctacccgcgcagctcaggggattggtgcaggacttctgacggcacgactcgggattaaagctatggagctttgccgcccgctgccgtggattgacgatgacaaacctcgcctcggggatttccgtcgtcagcttatcggtcaggtgaaagaaacgctgcaaaaaggcaaaacgcccagcgaaaaataatgcaatatcgggtgctgacTtcctcatggattctacgggG (SEQ ID NO: 
177)(N23)pheA-HF0.5(N24)TtcaaacagaaaggccatggGcatgtcgcagaccgtctcgccaaactggaaaaatggcaaacacatctgattaatccacatatcattctgtccaaagagccacaagggtttgttgctgacgccacaatcaatacacctaacggcgttctggttgccagtggtaaacatgaagatatgtacaccgcaattaacgaattgatcaacaagctggaacggcagctcaataaactgcagcacaaaggcgaagcacgtcgtgccgcaacatcggtgaaagacgccaacttcgtcgaagaagttgaagaagagtagtcctttatattgagtgtatcgccaacgcgccttcgggcgcgttttttgttgacagcgtgaaaacagtacgggtactgtactaaagtcacttaaggaaacaaacatgaaacacataccgtttttcttcgcattcttttttaccttcccctgaatgggaggcgtttcgTtcctcatggattctacgggG (SEQ ID NO: 178)(N23)ptsI-HF0.7(N24)TtcaaacagaaaggccatggGcatgtcgcagaccgtctcgccaaactggaaaaatggcaaacacatctgattaatccacatatcattctgtccaaagagccacaagggtttgttgctgacgccacaatcaatacacctaacggcgttctggttgccagtggtaaacatgaagatatgtacaccgcaattaacgaattgatcaacaagctggaacggcagctcaataaactgcagcacaaaggcgaagcacgtcgtgccgcaacatcggtgaaagacgccaacttcgtcgaagaagttgaagaagagtagtcctttatattgagtgtatcgccaacgcgccttcgggcgcgttttttgttgacagcgtgaaaacagtacgggtactgtactaaagtcacttaaggaaacaaacatgaaacacataccgtttttcttcgcattcttttttaccttcccctgaatgggaggcgtttcgTtcctcatggattctacgggG (SEQ ID NO: 179)(N24)nupG-HT0.5(N21)TtcctcatggattctacgggGttacgcaaagaaaaacgggtcgccagaaggtgacccgttttttttattcttacttcaacacataaccgtacaaccgtttcacgccatccgcatcggtttcgctataaacaccttgcagctccggcgaaaatcccggcaacaaattcaccccttcttccagtgcaaggaaataacgttgaaccgccccaccccagacttccccgggtaccacgcaaagcacgccaggtggataaggcaacgccccttctgccgcaattcgcccttcggcatcacgaatccgcaccaactccacgtcaccgcgaatataagcgctatgcgcatcctgggggttcatcaccactgacgggaaactctgctggcggaacatcgctttttgtaggtctttgacgtcgaaactgacatacagatcgtgcatctcctgacacaactggcgcagggtgtagtcgcgatagcgcaccggatacttgttataaacgctcggcaacacctcaaccagcggcTtccctcgactcacacttggG (SEQ ID NO: 180) 
(N24)nupG-HT1.0(N21)TtcctcatggattctacgggGttacgcaaagaaaaacgggtcgccagaaggtgacccgttttttttattcttacttcaacacataaccgtacaaccgtttcacgccatccgcatcggtttcgctataaacaccttgcagctccggcgaaaatcccggcaacaaattcaccccttcttccagtgcaaggaaataacgttgaaccgccccaccccagacttccccgggtaccacgcaaagcacgccaggtggataaggcaacgccccttctgccgcaattcgcccttcggcatcacgaatccgcaccaactccacgtcaccgcgaatataagcgctatgcgcatcctgggggttcatcaccactgacgggaaactctgctggcggaacatcgctttttgtaggtctttgacgtcgaaactgacatacagatcgtgcatctcctgacacaactggcgcagggtgtagtcgcgatagcgcaccggatacttgttataaacgctcggcaacacctcaaccagcggcgagtcatcctcaatatgctgttcaaattgcgccagcatcgccaccagttgtgccagcttctcgtggctttccgccggagttaataaaaacagaatggagttgagatcgcacttctccggcacaatgccgttctcacgcagatagtgcgccagaatcgtcgccggaacgccaaagtcgctatattcgccggtttcggcatcgatacctggtgtagtgagtaacagcttgcacggatcaacaaaatactgatccgcggcatatccttcaaagccgtgccacttcgcccccggctcaaaactgaaaaaacggcggtcgctggctaacactgatgtcggataatcctgccacaatttgccatcaacaacgggcgggataaacgggcggaacagcttacagcgcgcaagaatagccttgcgcgcttcaatccctatctcaacacactcagcccacagccgacgcccactctccccttcatgaattttggcgttaacatccagtgcagcaaacagcggatagaaagggctggtagaagcatggagcataaaggcgtTtccctcgactcacacttggG (SEQ ID NO: 181) 
(N24)nupG-HT1.5(N21)TtcctcatggattctacgggGttacgcaaagaaaaacgggtcgccagaaggtgacccgttttttttattcttacttcaacacataaccgtacaaccgtttcacgccatccgcatcggtttcgctataaacaccttgcagctccggcgaaaatcccggcaacaaattcaccccttcttccagtgcaaggaaataacgttgaaccgccccaccccagacttccccgggtaccacgcaaagcacgccaggtggataaggcaacgccccttctgccgcaattcgcccttcggcatcacgaatccgcaccaactccacgtcaccgcgaatataagcgctatgcgcatcctgggggttcatcaccactgacgggaaactctgctggcggaacatcgctttttgtaggtctttgacgtcgaaactgacatacagatcgtgcatctcctgacacaactggcgcagggtgtagtcgcgatagcgcaccggatacttgttataaacgctcggcaacacctcaaccagcggcgagtcatcctcaatatgctgttcaaattgcgccagcatcgccaccagttgtgccagcttctcgtggctttccgccggagttaataaaaacagaatggagttgagatcgcacttctccggcacaatgccgttctcacgcagatagtgcgccagaatcgtcgccggaacgccaaagtcgctatattcgccggtttcggcatcgatacctggtgtagtgagtaacagcttgcacggatcaacaaaatactgatccgcggcatatccttcaaagccgtgccacttcgcccccggctcaaaactgaaaaaacggcggtcgctggctaacactgatgtcggataatcctgccacaatttgccatcaacaacgggcgggataaacgggcggaacagcttacagcgcgcaagaatagccttgcgcgcttcaatccctatctcaacacactcagcccacagccgacgcccactctccccttcatgaattttggcgttaacatccagtgcagcaaacagcggatagaaagggctggtagaagcatggagcataaaggcgttattcaaccgcttatgcgggcaaaaacgcgcctgtccgcggatatggttatcttttttatggatctgcgacgtctgtgagaatcccgcctgctgtttgtgcaccgactgagtcacaaagatccccggatcgttttcgttaagttctaacagcagcggcgagctatccgccatcatcgggataaattgttcataaccgacccacgcggaatcaaacagaatgtaatcacacagatgcccaacggtatcgatcacctgacgggcgttatagacagtgccgtcataggttcccagctgaataatcgccaggcgatacgggcgcggcaggtcggctttttctggcgcaacgtcgcgaatttgctggcgcagatactcttcattaaaacagtgcgcatcaataccgccaatgaaaccaaacgggttgcgtgaagcttccagatagaccggcgtcgcTtccctcgactcacacttggG (SEQ ID NO:182) 
(N24)aslA-HT0.5(N21)TtcctcatggattctacgggGgccggcgctatcgctgagatctgcctttgccggatgcgatgctgacgcatcttatccagcctacagaacgctgcaatttattgaatttgcacgatcatgtaggccggataaggcgtttacgccgcatccggcaatcaaccgcaggcggccgccgatttctacttactcaccaccagcaaatgcgcatgcataatgtcgctggccgggcgaccgtgcgccagcaaatcagccattgctttaagatatggcggtagatggcggaaataacgctgatacccggcacacaaataattcagtcccggtttgccgctggcatcgagcatgaagcggtgtttcgggcagcctccccagcacgcttttaacacgttacaactgcgacactgcgccggtaactgcttaaatttatcttcaccaaacgcctgctgttgcggggaatcgatcatttctgcaattgtttgctggtgcatattccccagccgatattgcggataaacatagtgTtccctcgactcacacttggG (SEQ ID NO: 183) (N24)aslA-HT1.0(N21)TtcctcatggattctacgggGcggtaaactcgctgctgtgcgtatggatgagttcaagtatcacgtcctgattcagcaaccttacgcttatacccagagcggatatcagggtggattcaccggcacagtaatgcaaacggcgggatcgtcggtgtttaacctctacaccgatccgcaggaaagcgactccatcggcgtgcgccatattccgatgggtgtaccgctacagaccgaaatgcacgcgtatatggagatcctgaaaaaatatccaccacgcgcgcagattaaatctgactaagccggcgctatcgctgagatctgcctttgccggatgcgatgctgacgcatcttatccagcctacagaacgctgcaatttattgaatttgcacgatcatgtaggccggataaggcgtttacgccgcatccggcaatcaaccgcaggcggccgccgatttctacttactcaccaccagcaaatgcgcatgcataatgtcgctggccgggcgaccgtgcgccagcaaatcagccattgctttaagatatggcggtagatggcggaaataacgctgatacccggcacacaaataattcagtcccggtttgccgctggcatcgagcatgaagcggtgtttcgggcagcctccccagcacgcttttaacacgttacaactgcgacactgcgccggtaactgcttaaatttatcttcaccaaacgcctgctgttgcggggaatcgatcatttctgcaattgtttgctggtgcatattccccagccgatattgcggataaacatagtgatcacaggcgtaaacgtcgccgttgtgctcaacaatcaccgagcgcccacaggttggctgatgatggcaaaccgcacccggcgcaccgacaaaattggcaaacgcccattcgatattcatcacgaaaatcttgccgacgtcgcgtttgatccagtggtcgaatatcgccaccagaaactcaccgaactcctcggggcgcaccgaccattccgttagctcacccTtccctcgactcacacttggG (SEQ ID NO:184) 
(N24)melB-HT1.0(N21)TtcctcatggattctacgggGttacgcaaagaaaaacgggtcgccagaaggtgacccgttttttttattcttacttcaacacataaccgtacaaccgtttcacgccatccgcatcggtttcgctataaacaccttgcagctccggcgaaaatcccggcaacaaattcaccccttcttccagtgcaaggaaataacgttgaaccgccccaccccagacttccccgggtaccacgcaaagcacgccaggtggataaggcaacgccccttctgccgcaattcgcccttcggcatcacgaatccgcaccaactccacgtcaccgcgaatataagcgctatgcgcatcctgggggttcatcaccactgacgggaaactctgctggcggaacatcgctttttgtaggtctttgacgtcgaaactgacatacagatcgtgcatctcctgacacaactggcgcagggtgtagtcgcgatagcgcaccggatacttgttataaacgctcggcaacacctcaaccagcggcgagtcatcctcaatatgctgttcaaattgcgccagcatcgccaccagttgtgccagcttctcgtggctttccgccggagttaataaaaacagaatggagttgagatcgcacttctccggcacaatgccgttctcacgcagatagtgcgccagaatcgtcgccggaacgccaaagtcgctatattcgccggtttcggcatcgatacctggtgtagtgagtaacagcttgcacggatcaacaaaatactgatccgcggcatatccttcaaagccgtgccacttcgcccccggctcaaaactgaaaaaacggcggtcgctggctaacactgatgtcggataatcctgccacaatttgccatcaacaacgggcgggataaacgggcggaacagcttacagcgcgcaagaatagccttgcgcgcttcaatccctatctcaacacactcagcccacagccgacgcccactctccccttcatgaattttggcgttaacatccagtgcagcaaacagcggatagaaagggctggtagaagcatggagcataaaggcgttattcaaccgcttatgcgggcaaaaacgcgcctgtccgcggatatggttatcttttttatggatctgcgacgtctgtgagaatcccgcctgctgtttgtgcaccgactgagtcacaaagatccccggatcgttttcgttaagttctaacagcagcggcgagctatccgccatcatcgggataaattgttcataaccgacccacgcggaatcaaacagaatgtaatcacacagatgcccaacggtatcgatcacctgacgggcgttatagacagtgccgtcataggttcccagctgaataatcgccaggcgatacgggcgcggcaggtcggctttttctggcgcaacgtcgcgaatttgctggcgcagatactcttcattaaaacagtgcgcatcaataccgccaatgaaaccaaacgggttgcgtgaagcttccagatagaccggcgtcgcTtccctcgactcacacttggG (SEQ ID NO: 185) 
(N24)rcsB-HT1.0(N21)TtcctcatggattctacgggGgccggcgctatcgctgagatctgcctttgccggatgcgatgctgacgcatcttatccagcctacagaacgctgcaatttattgaatttgcacgatcatgtaggccggataaggcgtttacgccgcatccggcaatcaaccgcaggcggccgccgatttctacttactcaccaccagcaaatgcgcatgcataatgtcgctggccgggcgaccgtgcgccagcaaatcagccattgctttaagatatggcggtagatggcggaaataacgctgatacccggcacacaaataattcagtcccggtttgccgctggcatcgagcatgaagcggtgtttcgggcagcctccccagcacgcttttaacacgttacaactgcgacactgcgccggtaactgcttaaatttatcttcaccaaacgcctgctgttgcggggaatcgatcatttctgcaattgtttgctggtgcatattccccagccgatattgcggataaacatagtgTtccctcgactcacacttggG (SEQ ID NO: 186) (N24)tyrR-HT0.5(N21)TtcctcatggattctacgggGttatgctttcagtacagccagagctgcttcgtaatccggctcggtggtgatttcatccaccagctggctgaaaatcacattgtcattttcgtcaataaccacaacggcacgcgctgccagacctttcagtgggccatcagcaattgccacaccgtaagcttgcagaaattcagcgttacggaaagtggagagggtgataacgttgttcagaccttctgcgccgcagaaacgagactgggcgaacggcagatcggcagagatacacagcacaacggtgttgtcgatctcagttgccagttggttaaacttacgtactgatgcggcgcaaacaccggtatcaatactcgggaaaatgttcagcactttgcgtttacccgcaaactgaccgagggtgacgtcagacagatcttttgccacgagagtaaaagtctgcgctttgctacccgcctgcgggatggaattggcgactgtaaccgggttgccctggaaatgaacggtttgtgacatTtccctcgactcacacttggG (SEQ ID NO: 187)(N24)pheA-HT0.5(N21)TtcctcatggattctacgggGttactggcgattgtcattcgcctgacgcaataacacgcggctttcactctgaaaacgctgtgcgtaatcgccgaaccagtgctccaccttgcggaaactgtcaataaacgcctgcttatcgccctgctccagcaactcaatcgcctcgccgaaacgcttatagtaacgtttgattaacgccagattacgctctgacgacataatgatgtcggcataaagctgcggatcctgagcaaacagtcgcccgaccatcgccagctcaaggcggtaaatcggcgaagagagcgccagaagttgctcaagctgaacattttcttctgccaggtgcagcccgtaagcaaaagtagcaaagtggcgcagtgcctgaataaacgccatattctgatcgtgctcgacggcgctaatacgatgcagccgagcgccccagacctgaatttgctccagaaaccattggtatgcttccggtttacgtccatcacaccagaccacaacttgctttgccaggctaccgctgtccggaccgaacatcgggtgtagccccagcaccggaccatcatgcgccaccagcatggcctgtaatggcccatttTtccctcgactcacacttggG (SEQ ID NO: 188) 
(N24)ptsI-HT0.7(N21)TtcctcatggattctacgggGagcgcatcacttccagtacgcgcaaccccgctcggtgcactgcatcggttaacgccttccctttcagcaagccactgatgagctgagcacaaaacaggtcgccagtccctttcaggtcggtttttacccgtgaatgggaaatgacattcacgctgtcggcagtgaccaccacaacctgcatctcctgattttcttcattaccggaggcgctggtaaccaccacccattttaatgtgtctgaaagcagactttttgcggcagcaatggcactgtcgagatcgcggcaatttttaccggtcaggatttccaactcaaagatattgggggtaattccctgcgccagcggcagtaaatattgtcgatacgcttcgggaaggtcaggtttgacataaattccgctatcaatatcgccaatcaccggatcgaccatgatcaataggtcaggatggtctttgcgtagcgcagtcagccactcggcaaggattttgatttgcgatgccgttcccatatagcccgtggttacagcacgaagttggcgcagcgcatcacgctcctgaagcgcacgcaaatagccgctaaaccattcgtccggaatcgcaccaccgtagaaagtgtcataatgcggcgtattgctcagcaataccgtcggcacggcaaagacattcaggccgttctgtttgatagcaggcacggcaatgctgttgcccacgctgccgtaaaccacctgcgactgcacggcgacgatatccgcctgcagtgccctactcttatcgttaTtccctcgactcacacttggG (SEQ ID NO: 189) (N21)aadA(N22)TtccctcgactcacacttggGtcgacctgcagaagcttagatctattaccctgttatccctactcgagttcatgtgcagctccatcagcaaaaggggatgataagtttatcaccaccgactatttgcaacagtgccgttgatcgtgctatgatcgactgatgtcatcagcggtggagtgcaatgtcatgagggaagcggtgatcgccgaagtatcgactcaactatcagaggtagttggcgtcatcgagcgccatctcgaaccgacgttgctggccgtacatttgtacggctccgcagtggatggcggcctgaagccacacagtgatattgatttgctggttacggtgaccgtaaggcttgatgaaacaacgcggcgagctttgatcaacgaccttttggaaacttcggcttcccctggagagagcgagattctccgcgctgtagaagtcaccattgttgtgcacgacgacatcattccgtggcgttatccagctaagcgcgaactgcaatttggagaatggcagcgcaatgacattcttgcaggtatcttcgagccagccacgatcgacattgatctggctatcttgctgacaaaagcaagagaacatagcgttgccttggtaggtccagcggcggaggaactctttgatccggttcctgaacaggatctatttgaggcgctaaatgaaaccttaacgctatggaactcgccgcccgactgggctggcgatgagcgaaatgtagtgcttacgttgtcccgcatttggtacagcgcagtaaccggcaaaatcgcgccgaaggatgtcgctgccgactgggcaatggagcgcctgccggcccagtatcagcccgtcatacttgaagctagacaggcttatcttggacaagaagaagatcgcttggcctcgcgcgcagatcagttggaagaatttgtccactacgtgaaaggcgagatcaccaaggtagtcggcaaataagatgccgctcgccagtcgattggctgagctcatgaagttcctattccgaagttccgcgaacgcgtaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgTtcacacaacatagccacggG (SEQ ID 
NO: 190) (N21)bla(N22)TtccctcgactcacacttggGtttctacaaactcttttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaactggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcgcggtattatcccgtgttgacgccgggcaagagcaactcggtcgccgcatacactattctcagaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggaccacttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcgtagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgTtcacacaacatagccacggG (SEQ ID NO: 191) 
(N22)p5(N23)TtcacacaacatagccacggGccgttttcatctgtgcatatggacagttttccctttgatatgtaacggtgaacagttgttctacttttgtttgttagtcttgatgcttcactgatagatacaagagccataagaacctcagatccttccgtatttagccagtatgttctctagtgtggttcgttgtttttgcgtgagccatgagaacgaaccattgagatcatacttactttgcatgtcactcaaaaattttgcctcaaaactggtgagctgaatttttgcagttaaagcatcgtgtagtgtttttcttagtccgttatgtaggtaggaatctgatgtaatggttgttggtattttgtcaccattcatttttatctggttgttctcaagttcggttacgagatccatttgtctatctagttcaacttggaaaatcaacgtatcagtcgggcggcctcgcttatcaaccaccaatttcatattgctgtaagtgtttaaatctttacttattggtttcaaaacccattggttaagccttttaaactcatggtagttattttcaagcattaacatgaacttaaattcatcaaggctaatctctatatttgccttgtgagttttcttttgtgttagttcttttaataaccactcataaatcctcatagagtatttgttttcaaaagacttaacatgttccagattatattttatgaatttttttaactggaaaagataaggcaatatctcttcactaaaaactaattctaatttttcgcttgagaacttggcatagtttgtccactggaaaatctcaaagcctttaaccaaaggattcctgatttccacagttctcgtcatcagctctctggttgctttagctaatacaccataagcattttccctactgatgttcatcatctgagcgtattggttataagtgaacgataccgtccgttctttccttgtagggttttcaatcgtggggttgagtagtgccacacagcataaaattagcttggtttcatgctccgttaagtcatagcgactaatcgctagttcatttgctttgaaaacaactaattcagacatacatctcaattggtctaggtgattttaatcactataccaattgagatgggctagtcaatgataattactagtccttttcctttgagttgtgggtatctgtaaattctgctagacctttgctggaaaacttgtaaattctgctagaccctctgtaaattccgctagacctttgtgtgttttttttgtttatattcaagtggttataatttatagaataaagaaagaataaaaaaagataaaaagaatagatcccagccctgtgtataactcactactttagtcagttccgcagtattacaaaaggaTtcaaacagaaaggccatgg (SEQ ID NO: 192) 
(N22)p15(N23)TtcacacaacatagccacggGtgttcagctactgacggggtggtgcgtaacggcaaaagcaccgccggacatcagcgctagcggagtgtatactggcttactatgttggcactgatgagggtgtcagtgaagtgcttcatgtggcaggagaaaaaaggctgcaccggtgcgtcagcagaatatgtgatacaggatatattccgcttcctcgctcactgactcgctacgctcggtcgttcgactgcggcgagcggaaatggcttacgaacggggcggagatttcctggaagatgccaggaagatacttaacagggaagtgagagggccgcggcaaagccgtttttccataggctccgcccccctgacaagcatcacgaaatctgacgctcaaatcagtggtggcgaaacccgacaggactataaagataccaggcgtttccccctggcggctccctcgtgcgctctcctgttcctgcctttcggtttaccggtgtcattccgctgttatggccgcgtttgtctcattccacgcctgacactcagttccgggtaggcagttcgctccaagctggactgtatgcacgaaccccccgttcagtccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggaaagacatgcaaaagcaccactggcagcagccactggtaattgatttagaggagttagtcttgaagtcatgcgccggttaaggctaaactgaaaggacaagttttggtgactgcgctcctccaagccagttacctcggttcaaagagttggtagctcagagaaccttcgaaaaaccgccctgcaaggcggttttttcgttttcagagcaagagattacgcgcagaccaaaacgatctcaagaagatcatcttattaatcagataaaatatttctagatttcagtgcaatttatctcttcaaatgtagcacctgaagtcagccccatacgatataagttgtaattctcatgtttgacagcttatcacccagtcctgctcgcttcgctacttggccatacccacgccgaaacaagcgctcatgagcccgaagtggcgagcccgatcttccccatcggtgatgtcTtcaaacagaaaggccatggG (SEQ ID NO: 193)(N22)pMB1(N23) 
TtcacacaacatagccacggGagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttggagcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcctgatgcggtattttctccttacgcatctgtgcggtatttcacaccgcatatgctggatccttgacagctagctcagtcctaggtataatactag (SEQ ID NO: 194) (N22)pMB1 mutant(N23)TtcacacaacatagccacggGgctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatcctTtcaaacagaaaggccatggG (SEQ ID NO: 195) 
(N23)pLac(RBS1)TtcaaacagaaaggccatggGcaacgcaattaatgtgagttagctcactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaattgtgagcggataacaatagaaataattttgtttaactttaagaaggagatatacatatg (SEQ ID NO: 196)(N23)LacIT7(RBS1) tcaaacagaaaggccatggGgaaactacccataatacaagaaaagcccgtcacgggcttctcagggcgttttatggcgggtctgctatgtggtgctatctgactttttgctgttcagcagttcctgccctctgattttccagtctgaccacttcggattatcccgtgacaggtcattcagactggctaatgcacccagtaaggcagcggtatcatcaacaggcttacccgtcttactgtcgggaattcgcgttggccgattcattaatgcagctggcacgacaggtttcctctagatttcagtgcaatttatctcttcaaatgtagcacctgaagtcagccccatacgatataagttgtaattctcatgttagtcatgccccgcgcccaccggaaggagctgactgggttgaaggctctcaagggcatcggtcgagatcccggtgcctaatgagtgagctaacttacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgccagggtggtttttcttttcaccagtgagacgggcaacagctgattgcccttcaccgcctggccctgagagagttgcagcaagcggtccacgctggtttgccccagcaggcgaaaatcctgtttgatggtggttaacggcgggatataacatgagctgtcttcggtatcgtcgtatcccactaccgagatgtccgcaccaacgcgcagcccggactcggtaatggcgcgcattgcgcccagcgccatctgatcgttggcaaccagcatcgcagtgggaacgatgccctcattcagcatttgcatggtttgttgaaaaccggacatggcactccagtcgccttcccgttccgctatcggctgaatttgattgcgagtgagatatttatgccagccagccagacgcagacgcgccgagacagaacttaatgggcccgctaacagcgcgatttgctggtgacccaatgcgaccagatgctccacgcccagtcgcgtaccgtcttcatgggagaaaataatactgttgatgggtgtctggtcagagacatcaagaaataacgccggaacattagtgcaggcagcttccacagcaatggcatcctggtcatccagcggatagttaatgatcagcccactgacgcgttgcgcgagaagattgtgcaccgccgctttacaggcttcgacgccgcttcgttctaccatcgacaccaccacgctggcacccagttgatcggcgcgagatttaatcgccgcgacaatttgcgacggcgcgtgcagggccagactggaggtggcaacgccaatcagcaacgactgtttgcccgccagttgttgtgccacgcggttgggaatgtaattcagctccgccatcgccgcttccactttttcccgcgttttcgcagaaacgtggctggcctggttcaccacgcgggaaacggtctgataagagacaccggcatactctgcgacatcgtataacgttactggtttcacattcaccaccctgaattgactctcttccgggcgctatcatgccataccgcgaaaggttttgcgccattcgatggtgtccgggatctcgacgctctcccttatgcgactcctgcattaggaaattaatacgactcactataggggaattgtgagcggataacaattcccctctagaaataattttgtttaactttaagaaggagatatacatatg(SEQ 
ID NO: 197) (N31)t7t(N21)TgatgggctgaagggtttaaGggctgctaacaaagcccgaaaggaagctgagttggctgctgccaccgctgagcaataactagcataaccccttggggcctctaaacgggtcttgaggggttttttgctgaaaggaggaactatatccggatatcccgcaagaggcccggcagtaccggcataaccaagcctatgcctacagcatccagggtgacggtgccTTccctcgactcacacttgGG (SEQ ID NO: 198) (N31)t7t(LK3)TgatgggctgaagggtttaaGggctgctaacaaagcccgaaaggaagctgagttggctgctgccaccgctgagcaataactagcataaccccttggggcctctaaacgggtcttgaggggttttttgctgaaaggaggaactatatccggatatcccgcaagaggcccggcagtaccggcataaccaagcctatgcctacagcatccagggtgacggtgccTcaggctcgtcttcttcaggG (SEQ ID NO: 199) (LK3)LacI(N21)TcaggctcgtcttcttcaggGgaaactacccataatacaagaaaagcccgtcacgggcttctcagggcgttttatggcgggtctgctatgtggtgctatctgactttttgctgttcagcagttcctgccctctgattttccagtctgaccacttcggattatcccgtgacaggtcattcagactggctaatgcacccagtaaggcagcggtatcatcaacaggcttacccgtcttactgtcgggaattcgcgttggccgattcattaatgcagctggcacgacaggtttcctctagatttcagtgcaatttatctcttcaaatgtagcacctgaagtcagccccatacgatataagttgtaattctcatgttagtcatgccccgcgcccaccggaaggagctgactgggttgaaggctctcaagggcatcggtcgagatcccggtgcctaatgagtgagctaacttacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgccagggtggtttttcttttcaccagtgagacgggcaacagctgattgcccttcaccgcctggccctgagagagttgcagcaagcggtccacgctggtttgccccagcaggcgaaaatcctgtttgatggtggttaacggcgggatataacatgagctgtcttcggtatcgtcgtatcccactaccgagatgtccgcaccaacgcgcagcccggactcggtaatggcgcgcattgcgcccagcgccatctgatcgttggcaaccagcatcgcagtgggaacgatgccctcattcagcatttgcatggtttgttgaaaaccggacatggcactccagtcgccttcccgttccgctatcggctgaatttgattgcgagtgagatatttatgccagccagccagacgcagacgcgccgagacagaacttaatgggcccgctaacagcgcgatttgctggtgacccaatgcgaccagatgctccacgcccagtcgcgtaccgtcttcatgggagaaaataatactgttgatgggtgtctggtcagagacatcaagaaataacgccggaacattagtgcaggcagcttccacagcaatggcatcctggtcatccagcggatagttaatgatcagcccactgacgcgttgcgcgagaagattgtgcaccgccgctttacaggcttcgacgccgcttcgttctaccatcgacaccaccacgctggcacccagttgatcggcgcgagatttaatcgccgcgacaatttgcgacggcgcgtgcagggccagactggaggtggcaacgccaatcagcaacgactgtttgcccgccagttgttgtgccacgcggttgggaatgta
attcagctccgccatcgccgcttccactttttcccgcgttttcgcagaaacgtggctggcctggttcaccacgcgggaaacggtctgataagagacaccggcatactctgcgacatcgtataacgttactggtttcacattcaccaccctgaattgactctcttccgggcgctatcatgccataccgcgaaaggttttgcgccattcgatggtgtccgggaTTccctcgactcacacttgGG (SEQ ID NO: 200) (RBS1)dxs(RBS2stop)tagaaataattttgtttaactttaagaaggagatatacatatgagttttgatattgccaaatacccgaccctggcactggtcgactccacccaggagttacgactgttgccgaaagagagtttaccgaaactctgcgacgaactgcgccgctatttactcgacagcgtgagccgttccagcgggcacttcgcctccgggctgggcacggtcgaactgaccgtggcgctgcactatgtctacaacaccccgtttgaccaattgatttgggatgtggggcatcaggcttatccgcataaaattttgaccggacgccgcgacaaaatcggcaccatccgtcagaaaggcggtctgcacccgttcccgtggcgcggcgaaagcgaatatgacgtattaagcgtcgggcattcatcaacctccatcagtgccggaattggtattgcggttgctgccgaaaaagaaggcaaaaatcgccgcaccgtctgtgtcattggcgatggcgcgattaccgcaggcatggcgtttgaagcgatgaatcacgcgggcgatatccgtcctgatatgctggtgattctcaacgacaatgaaatgtcgatttccgaaaatgtcggcgcgctcaacaaccatctggcacagctgctttccggtaagctttactcttcactgcgcgaaggcgggaaaaaagttttctctggcgtgccgccaattaaagagctgctcaaacgcaccgaagaacatattaaaggcatggtagtgcctggcacgttgtttgaagagctgggctttaactacatcggcccggtggacggtcacgatgtgctggggcttatcaccacgctaaagaacatgcgcgacctgaaaggcccgcagttcctgcatatcatgaccaaaaaaggtcgtggttatgaaccggcagaaaaagacccgatcactttccacgccgtgcctaaatttgatccctccagcggttgtttgccgaaaagtagcggcggtttgccgagctattcaaaaatctttggcgactggttgtgcgaaacggcagcgaaagacaacaagctgatggcgattactccggcgatgcgtgaaggttccggcatggtcgagttttcacgtaaattcccggatcgctacttcgacgtggcaattgccgagcaacacgcggtgacctttgctgcgggtctggcgattggtgggtacaaacccattgtcgcgatttactccactttcctgcaacgcgcctatgatcaggtgctgcatgacgtggcgattcaaaagcttccggtcctgttcgccatcgaccgcgcgggcattgttggtgctgacggtcaaacccatcagggtgcttttgatctctcttacctgcgctgcataccggaaatggtcattatgaccccgagcgatgaaaacgaatgtcgccagatgctctataccggctatcactataacgatggcccgtcagcggtgcgctacccgcgtggcaacgcggtcggcgtggaactgacgccgctggaaaaactaccaattggcaaaggcattgtgaagcgtcgtggcgagaaactggcgatccttaactttggtacgctgatgccagaagcggcgaaagtcgccgaatcgctgaacgccacgctggtcgatatgcgttttgtgaaaccgcttgatgaagcgttaattctggaaatggccgccagccatgaagcgctggtcaccgtagaagaaaacgcca
ttatgggcggcgcaggcagcggcgtgaacgaagtgctgatggcccatcgtaaaccagtacccgtgctgaacattggcctgccggacttctttattccgcaaggaactcaggaagaaatgcgcgccgaactcggcctcgatgccgctggtatggaagccaaaatcaaggcctggctggcaTaaccgttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 201) (RBS2stop)idi(RBS3stop)TaaccgttcatttatcacaaaaggattgttcgatGCAAACGGAACACGTCATTTTATTGAATGCACAGGGAGTTCCCACGGGTACGCTGGAAAAGTATGCCGCACACACGGCAGACACCCGCTTACATCTCGCGTTCTCCAGTTGGCTGTTTAATGCCAAAGGACAATTATTAGTTACCCGCCGCGCACTGAGCAAAAAAGCATGGCCTGGCGTGTGGACTAACTCGGTTTGTGGGCACCCACAACTGGGAGAAAGCAACGAAGACGCAGTGATCCGCCGTTGCCGTTATGAGCTTGGCGTGGAAATTACGCCTCCTGAATCTATCTATCCTGACTTTCGCTACCGCGCCACCGATCCGAGTGGCATTGTGGAAAATGAAGTGTGTCCGGTATTTGCCGCACGCACCACTAGTGCGTTACAGATCAATGATGATGAAGTGATGGATTATCAATGGTGTGATTTAGCAGATGTATTACACGGTATTGATGCCACGCCGTGGGCGTTCAGTCCGTGGATGGTGATGCAGGCGACAAATCGCGAAGCCAGAAAACGATTATCTGCATTTACCCAGCTTAAATgattcacacaggaaacagctatG (SEQ ID NO: 202)(RBS3stop)valC(RBS4stop)TgattcacacaggaaacagctatGGCCGAGATGTTCAACGGCAACTCTTCTAACGACGGATCTTCTTGCATGCCCGTGAAGGACGCCCTGCGACGAACCGGCAACCACCACCCCAACCTGTGGACCGACGACTTCATCCAGTCTCTGAACTCTCCCTACTCTGACTCTTCTTACCACAAGCACCGAGAGATCCTGATCGACGAGATCCGAGACATGTTCTCTAACGGCGAGGGCGACGAGTTCGGCGTGCTCGAGAACATCTGGTTCGTGGACGTGGTGCAGCGACTGGGCATCGACCGACACTTCCAGGAGGAGATCAAGACCGCCCTGGACTACATCTACAAGTTCTGGAACCACGACTCTATCTTCGGCGACCTGAACATGGTGGCCCTGGGCTTCCGAATCCTGCGACTGAACCGATACGTGGCCTCTTCTGACGTGTTCAAGAAGTTCAAGGGCGAGGAGGGCCAGTTCTCTGGCTTCGAGTCCTCTGACCAGGACGCTAAGCTCGAAATGATGCTGAACCTGTACAAGGCCTCTGAGCTGGACTTCCCCGACGAGGACATCCTGAAGGAGGCCCGAGCCTTCGCCTCTATGTACCTGAAGCACGTGATCAAGGAGTACGGCGACATCCAGGAGTCTAAGAACCCCCTGCTGATGGAGATCGAGTACACCTTCAAGTACCCCTGGCGATGCCGACTGCCCCGACTCGAGGCCTGGAACTTCATCCACATCATGCGACAGCAGGACTGCAACATCTCTCTGGCCAACAACCTCTACAAGATCCCCAAGATCTACATGAAGAAGATCCTCGAGCTGGCCATCCTGGACTTCAACATCCTGCAGTCTCAGCACCAGCACGAGATGAAGCTGATCTCTACCTGGTGGAAGAACTCTTCTGCTATCCAGCTGGACTTCTTCCGACACCGACACATCGAGTCTTACTTTTGGTGGGCCTCGCCCCTGTTCGAGCCCGAGTTCTCTACCTGCCGAATCAACTGCACCAAGCTGTCTACCAAGATGTTCCTGCTGGACGACATCTACGACACCTACGGCACCGTCGAGGAGCTGAAGCCCTTCACCACCACCCTGACCCGATG
GGACGTGTCTACCGTGGACAACCACCCCGACTACATGAAGATCGCCTTCAACTTCTCTTACGAGATCTACAAGGAGATCGCCTCTGAGGCCGAGCGAAAGCACGGCCCCTTCGTGTACAAGTACCTGCAGTCTTGCTGGAAGTCTTACATCGAGGCCTACATGCAGGAGGCCGAGTGGATCGCCTCTAACCACATCCCCGGCTTCGACGAGTACCTGATGAACGGCGTGAAGTCCTCTGGCATGCGAATCCTGATGATCCACGCCCTGATCCTGATGGACACCCCCCTGTCTGACGAGATTCTCGAGCAGCTGGACATCCCCTCGTCTAAGTCTCAGGCCCTGCTGTCTCTGATCACCCGACTGGTGGACGACGTGAAGGACTTCGAGGACGAGCAGGCCCACGGCGAGATGGCCTCTTCTATCGAGTGCTACATGAAGGACAACCACGGCTCTACCCGAGAGGACGCCCTGAACTACCTGAAGATCCGAATCGAGTCTTGCGTGCAGGAGCTGAACAAGGAGCTGCTCGAGCCCTCTAACATGCACGGATCTTTCCGAAACCTGTACCTGAACGTGGGAATGCGAGTGATTTTCTTCATGCTGAACGACGGCGACCTGTTCACCCACTCTAACCGAAAGGAGATCCAGGACGCCATCACCAAGTTCTTCGTCGAGCCCATCATCCCCTaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 203)(RBS4stop)ispA(N31)TaaattaattgttcttttttcaggtgaaggttcccatGGACTTTCCGCAGCAACTCGAAGCCTGCGTTAAGCAGGCCAACCAGGCGCTGAGCCGTTTTATCGCCCCACTGCCCTTTCAGAACACTCCCGTGGTCGAAACCATGCAGTATGGCGCATTATTAGGTGGTAAGCGCCTGCGACCTTTCCTGGTTTATGCCACCGGTCATATGTTCGGCGTTAGCACAAACACGCTGGACGCACCCGCTGCCGCCGTTGAGTGTATCCACGCTTACTCATTAATTCATGATGATTTACCGGCAATGGATGATGACGATCTGCGTCGCGGTTTGCCAACCTGCCATGTGAAGTTTGGCGAAGCAAACGCGATTCTCGCTGGCGACGCTTTACAAACGCTGGCGTTCTCGATTTTAAGCGATGCCGATATGCCGGAAGTGTCGGACCGCGACAGAATTTCGATGATTTCTGAACTGGCGAGCGCCAGTGGTATTGCCGGAATGTGCGGTGGTCAGGCATTAGATTTAGACGCGGAAGGCAAACACGTACCTCTGGACGCGCTTGAGCGTATTCATCGTCATAAAACCGGCGCATTGATTCGCGCCGCCGTTCGCCTTGGTGCATTAAGCGCCGGAGATAAAGGACGTCGTGCTCTGCCGGTACTCGACAAGTATGCAGAGAGCATCGGCCTTGCCTTCCAGGTTCAGGATGACATCCTGGATGTGGTGGGAGATACTGCAACGTTGGGAAAACGCCAGGGTGCCGACCAGCAACTTGGTAAAAGTACCTACCCTGCACTTCTGGGTCTTGAGCAAGCCCGGAAGAAAGCCCGGGATCTGATCGACGATGCCCGTCAGTCGCTGAAACAACTGGCTGAACAGTCACTCGATACCTCGGCACTGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAATgatgggctgaagggtttaaG (SEQ ID NO: 
204)(RBS1)idi(RBS2stop)tagaaataattttgtttaactttaagaaggagatatacatatgCAAACGGAACACGTCATTTTATTGAATGCACAGGGAGTTCCCACGGGTACGCTGGAAAAGTATGCCGCACACACGGCAGACACCCGCTTACATCTCGCGTTCTCCAGTTGGCTGTTTAATGCCAAAGGACAATTATTAGTTACCCGCCGCGCACTGAGCAAAAAAGCATGGCCTGGCGTGTGGACTAACTCGGTTTGTGGGCACCCACAACTGGGAGAAAGCAACGAAGACGCAGTGATCCGCCGTTGCCGTTATGAGCTTGGCGTGGAAATTACGCCTCCTGAATCTATCTATCCTGACTTTCGCTACCGCGCCACCGATCCGAGTGGCATTGTGGAAAATGAAGTGTGTCCGGTATTTGCCGCACGCACCACTAGTGCGTTACAGATCAATGATGATGAAGTGATGGATTATCAATGGTGTGATTTAGCAGATGTATTACACGGTATTGATGCCACGCCGTGGGCGTTCAGTCCGTGGATGGTGATGCAGGCGACAAATCGCGAAGCCAGAAAACGATTATCTGCATTTACCCAGCTTAAATaaccgttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 205) (RBS2stop)dxs(RBS3stop)TaaccgttcatttatcacaaaaggattgttcgatGagttttgatattgccaaatacccgaccctggcactggtcgactccacccaggagttacgactgttgccgaaagagagtttaccgaaactctgcgacgaactgcgccgctatttactcgacagcgtgagccgttccagcgggcacttcgcctccgggctgggcacggtcgaactgaccgtggcgctgcactatgtctacaacaccccgtttgaccaattgatttgggatgtggggcatcaggcttatccgcataaaattttgaccggacgccgcgacaaaatcggcaccatccgtcagaaaggcggtctgcacccgttcccgtggcgcggcgaaagcgaatatgacgtattaagcgtcgggcattcatcaacctccatcagtgccggaattggtattgcggttgctgccgaaaaagaaggcaaaaatcgccgcaccgtctgtgtcattggcgatggcgcgattaccgcaggcatggcgtttgaagcgatgaatcacgcgggcgatatccgtcctgatatgctggtgattctcaacgacaatgaaatgtcgatttccgaaaatgtcggcgcgctcaacaaccatctggcacagctgctttccggtaagctttactcttcactgcgcgaaggcgggaaaaaagttttctctggcgtgccgccaattaaagagctgctcaaacgcaccgaagaacatattaaaggcatggtagtgcctggcacgttgtttgaagagctgggctttaactacatcggcccggtggacggtcacgatgtgctggggcttatcaccacgctaaagaacatgcgcgacctgaaaggcccgcagttcctgcatatcatgaccaaaaaaggtcgtggttatgaaccggcagaaaaagacccgatcactttccacgccgtgcctaaatttgatccctccagcggttgtttgccgaaaagtagcggcggtttgccgagctattcaaaaatctttggcgactggttgtgcgaaacggcagcgaaagacaacaagctgatggcgattactccggcgatgcgtgaaggttccggcatggtcgagttttcacgtaaattcccggatcgctacttcgacgtggcaattgccgagcaacacgcggtgacctttgctgcgggtctggcgattggtgggtacaaacccattgtcgcgatttactccactttcctgcaacgcgcctatgatcaggtgctgcatgacgtggcgattcaaaagcttccggtcctgttcgccatcgaccgcgcgggcattgttggtgctgacgg
tcaaacccatcagggtgcttttgatctctcttacctgcgctgcataccggaaatggtcattatgaccccgagcgatgaaaacgaatgtcgccagatgctctataccggctatcactataacgatggcccgtcagcggtgcgctacccgcgtggcaacgcggtcggcgtggaactgacgccgctggaaaaactaccaattggcaaaggcattgtgaagcgtcgtggcgagaaactggcgatccttaactttggtacgctgatgccagaagcggcgaaagtcgccgaatcgctgaacgccacgctggtcgatatgcgttttgtgaaaccgcttgatgaagcgttaattctggaaatggccgccagccatgaagcgctggtcaccgtagaagaaaacgccattatgggcggcgcaggcagcggcgtgaacgaagtgctgatggcccatcgtaaaccagtacccgtgctgaacattggcctgccggacttctttattccgcaaggaactcaggaagaaatgcgcgccgaactcggcctcgatgccgctggtatggaagccaaaatcaaggcctggctggcaTgattcacacaggaaacagctatG (SEQ ID NO: 206)(RBS3stop)dxs(RBS4stop)TgattcacacaggaaacagctatGagttttgatattgccaaatacccgaccctggcactggtcgactccacccaggagttacgactgttgccgaaagagagtttaccgaaactctgcgacgaactgcgccgctatttactcgacagcgtgagccgttccagcgggcacttcgcctccgggctgggcacggtcgaactgaccgtggcgctgcactatgtctacaacaccccgtttgaccaattgatttgggatgtggggcatcaggcttatccgcataaaattttgaccggacgccgcgacaaaatcggcaccatccgtcagaaaggcggtctgcacccgttcccgtggcgcggcgaaagcgaatatgacgtattaagcgtcgggcattcatcaacctccatcagtgccggaattggtattgcggttgctgccgaaaaagaaggcaaaaatcgccgcaccgtctgtgtcattggcgatggcgcgattaccgcaggcatggcgtttgaagcgatgaatcacgcgggcgatatccgtcctgatatgctggtgattctcaacgacaatgaaatgtcgatttccgaaaatgtcggcgcgctcaacaaccatctggcacagctgctttccggtaagctttactcttcactgcgcgaaggcgggaaaaaagttttctctggcgtgccgccaattaaagagctgctcaaacgcaccgaagaacatattaaaggcatggtagtgcctggcacgttgtttgaagagctgggctttaactacatcggcccggtggacggtcacgatgtgctggggcttatcaccacgctaaagaacatgcgcgacctgaaaggcccgcagttcctgcatatcatgaccaaaaaaggtcgtggttatgaaccggcagaaaaagacccgatcactttccacgccgtgcctaaatttgatccctccagcggttgtttgccgaaaagtagcggcggtttgccgagctattcaaaaatctttggcgactggttgtgcgaaacggcagcgaaagacaacaagctgatggcgattactccggcgatgcgtgaaggttccggcatggtcgagttttcacgtaaattcccggatcgctacttcgacgtggcaattgccgagcaacacgcggtgacctttgctgcgggtctggcgattggtgggtacaaacccattgtcgcgatttactccactttcctgcaacgcgcctatgatcaggtgctgcatgacgtggcgattcaaaagcttccggtcctgttcgccatcgaccgcgcgggcattgttggtgctgacggtcaaacccatcagggtgcttttgatctctcttacctgcgctgcataccggaaatg
gtcattatgaccccgagcgatgaaaacgaatgtcgccagatgctctataccggctatcactataacgatggcccgtcagcggtgcgctacccgcgtggcaacgcggtcggcgtggaactgacgccgctggaaaaactaccaattggcaaaggcattgtgaagcgtcgtggcgagaaactggcgatccttaactttggtacgctgatgccagaagcggcgaaagtcgccgaatcgctgaacgccacgctggtcgatatgcgttttgtgaaaccgcttgatgaagcgttaattctggaaatggccgccagccatgaagcgctggtcaccgtagaagaaaacgccattatgggcggcgcaggcagcggcgtgaacgaagtgctgatggcccatcgtaaaccagtacccgtgctgaacattggcctgccggacttctttattccgcaaggaactcaggaagaaatgcgcgccgaactcggcctcgatgccgctggtatggaagccaaaatcaaggcctggctggcaTaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 207)(RBS4stop)dxs(N31)TaaattaattgttcttttttcaggtgaaggttcccatGagttttgatattgccaaatacccgaccctggcactggtcgactccacccaggagttacgactgttgccgaaagagagtttaccgaaactctgcgacgaactgcgccgctatttactcgacagcgtgagccgttccagcgggcacttcgcctccgggctgggcacggtcgaactgaccgtggcgctgcactatgtctacaacaccccgtttgaccaattgatttgggatgtggggcatcaggcttatccgcataaaattttgaccggacgccgcgacaaaatcggcaccatccgtcagaaaggcggtctgcacccgttcccgtggcgcggcgaaagcgaatatgacgtattaagcgtcgggcattcatcaacctccatcagtgccggaattggtattgcggttgctgccgaaaaagaaggcaaaaatcgccgcaccgtctgtgtcattggcgatggcgcgattaccgcaggcatggcgtttgaagcgatgaatcacgcgggcgatatccgtcctgatatgctggtgattctcaacgacaatgaaatgtcgatttccgaaaatgtcggcgcgctcaacaaccatctggcacagctgctttccggtaagctttactcttcactgcgcgaaggcgggaaaaaagttttctctggcgtgccgccaattaaagagctgctcaaacgcaccgaagaacatattaaaggcatggtagtgcctggcacgttgtttgaagagctgggctttaactacatcggcccggtggacggtcacgatgtgctggggcttatcaccacgctaaagaacatgcgcgacctgaaaggcccgcagttcctgcatatcatgaccaaaaaaggtcgtggttatgaaccggcagaaaaagacccgatcactttccacgccgtgcctaaatttgatccctccagcggttgtttgccgaaaagtagcggcggtttgccgagctattcaaaaatctttggcgactggttgtgcgaaacggcagcgaaagacaacaagctgatggcgattactccggcgatgcgtgaaggttccggcatggtcgagttttcacgtaaattcccggatcgctacttcgacgtggcaattgccgagcaacacgcggtgacctttgctgcgggtctggcgattggtgggtacaaacccattgtcgcgatttactccactttcctgcaacgcgcctatgatcaggtgctgcatgacgtggcgattcaaaagcttccggtcctgttcgccatcgaccgcgcgggcattgttggtgctgacggtcaaacccatcagggtgcttttgatctctcttacctgcgctgcataccggaaatggtcattatgaccccgagcgatgaaaacgaatg
tcgccagatgctctataccggctatcactataacgatggcccgtcagcggtgcgctacccgcgtggcaacgcggtcggcgtggaactgacgccgctggaaaaactaccaattggcaaaggcattgtgaagcgtcgtggcgagaaactggcgatccttaactttggtacgctgatgccagaagcggcgaaagtcgccgaatcgctgaacgccacgctggtcgatatgcgttttgtgaaaccgcttgatgaagcgttaattctggaaatggccgccagccatgaagcgctggtcaccgtagaagaaaacgccattatgggcggcgcaggcagcggcgtgaacgaagtgctgatggcccatcgtaaaccagtacccgtgctgaacattggcctgccggacttctttattccgcaaggaactcaggaagaaatgcgcgccgaactcggcctcgatgccgctggtatggaagccaaaatcaaggcctggctggcaTgatgggctgaagggtttaaG (SEQ ID NO: 208)(RBS1)valC(RBS2stop)tagaaataattttgtttaactttaagaaggagatatacatatgGCCGAGATGTTCAACGGCAACTCTTCTAACGACGGATCTTCTTGCATGCCCGTGAAGGACGCCCTGCGACGAACCGGCAACCACCACCCCAACCTGTGGACCGACGACTTCATCCAGTCTCTGAACTCTCCCTACTCTGACTCTTCTTACCACAAGCACCGAGAGATCCTGATCGACGAGATCCGAGACATGTTCTCTAACGGCGAGGGCGACGAGTTCGGCGTGCTCGAGAACATCTGGTTCGTGGACGTGGTGCAGCGACTGGGCATCGACCGACACTTCCAGGAGGAGATCAAGACCGCCCTGGACTACATCTACAAGTTCTGGAACCACGACTCTATCTTCGGCGACCTGAACATGGTGGCCCTGGGCTTCCGAATCCTGCGACTGAACCGATACGTGGCCTCTTCTGACGTGTTCAAGAAGTTCAAGGGCGAGGAGGGCCAGTTCTCTGGCTTCGAGTCCTCTGACCAGGACGCTAAGCTCGAAATGATGCTGAACCTGTACAAGGCCTCTGAGCTGGACTTCCCCGACGAGGACATCCTGAAGGAGGCCCGAGCCTTCGCCTCTATGTACCTGAAGCACGTGATCAAGGAGTACGGCGACATCCAGGAGTCTAAGAACCCCCTGCTGATGGAGATCGAGTACACCTTCAAGTACCCCTGGCGATGCCGACTGCCCCGACTCGAGGCCTGGAACTTCATCCACATCATGCGACAGCAGGACTGCAACATCTCTCTGGCCAACAACCTCTACAAGATCCCCAAGATCTACATGAAGAAGATCCTCGAGCTGGCCATCCTGGACTTCAACATCCTGCAGTCTCAGCACCAGCACGAGATGAAGCTGATCTCTACCTGGTGGAAGAACTCTTCTGCTATCCAGCTGGACTTCTTCCGACACCGACACATCGAGTCTTACTTTTGGTGGGCCTCGCCCCTGTTCGAGCCCGAGTTCTCTACCTGCCGAATCAACTGCACCAAGCTGTCTACCAAGATGTTCCTGCTGGACGACATCTACGACACCTACGGCACCGTCGAGGAGCTGAAGCCCTTCACCACCACCCTGACCCGATGGGACGTGTCTACCGTGGACAACCACCCCGACTACATGAAGATCGCCTTCAACTTCTCTTACGAGATCTACAAGGAGATCGCCTCTGAGGCCGAGCGAAAGCACGGCCCCTTCGTGTACAAGTACCTGCAGTCTTGCTGGAAGTCTTACATCGAGGCCTACATGCAGGAGGCCGAGTGGATCGCCTCTAACCACATCCCCGGCTTCGACGAGTACCTGATGAACGGCGTGAAGTCCTCTGGCATGCGAATCCTGATGATCCACGCCCTGATCCTGATGGACACCCCCCTGTCTGACGAGATTCTCGAGCAGCTGGACATCCCCTCGTCTAAGTC
TCAGGCCCTGCTGTCTCTGATCACCCGACTGGTGGACGACGTGAAGGACTTCGAGGACGAGCAGGCCCACGGCGAGATGGCCTCTTCTATCGAGTGCTACATGAAGGACAACCACGGCTCTACCCGAGAGGACGCCCTGAACTACCTGAAGATCCGAATCGAGTCTTGCGTGCAGGAGCTGAACAAGGAGCTGCTCGAGCCCTCTAACATGCACGGATCTTTCCGAAACCTGTACCTGAACGTGGGAATGCGAGTGATTTTCTTCATGCTGAACGACGGCGACCTGTTCACCCACTCTAACCGAAAGGAGATCCAGGACGCCATCACCAAGTTCTTCGTCGAGCCCATCATCCCCTaaccgttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 209)(RBS2stop)valC(RBS3stop)TaaccgttcatttatcacaaaaggattgttcgatGGCCGAGATGTTCAACGGCAACTCTTCTAACGACGGATCTTCTTGCATGCCCGTGAAGGACGCCCTGCGACGAACCGGCAACCACCACCCCAACCTGTGGACCGACGACTTCATCCAGTCTCTGAACTCTCCCTACTCTGACTCTTCTTACCACAAGCACCGAGAGATCCTGATCGACGAGATCCGAGACATGTTCTCTAACGGCGAGGGCGACGAGTTCGGCGTGCTCGAGAACATCTGGTTCGTGGACGTGGTGCAGCGACTGGGCATCGACCGACACTTCCAGGAGGAGATCAAGACCGCCCTGGACTACATCTACAAGTTCTGGAACCACGACTCTATCTTCGGCGACCTGAACATGGTGGCCCTGGGCTTCCGAATCCTGCGACTGAACCGATACGTGGCCTCTTCTGACGTGTTCAAGAAGTTCAAGGGCGAGGAGGGCCAGTTCTCTGGCTTCGAGTCCTCTGACCAGGACGCTAAGCTCGAAATGATGCTGAACCTGTACAAGGCCTCTGAGCTGGACTTCCCCGACGAGGACATCCTGAAGGAGGCCCGAGCCTTCGCCTCTATGTACCTGAAGCACGTGATCAAGGAGTACGGCGACATCCAGGAGTCTAAGAACCCCCTGCTGATGGAGATCGAGTACACCTTCAAGTACCCCTGGCGATGCCGACTGCCCCGACTCGAGGCCTGGAACTTCATCCACATCATGCGACAGCAGGACTGCAACATCTCTCTGGCCAACAACCTCTACAAGATCCCCAAGATCTACATGAAGAAGATCCTCGAGCTGGCCATCCTGGACTTCAACATCCTGCAGTCTCAGCACCAGCACGAGATGAAGCTGATCTCTACCTGGTGGAAGAACTCTTCTGCTATCCAGCTGGACTTCTTCCGACACCGACACATCGAGTCTTACTTTTGGTGGGCCTCGCCCCTGTTCGAGCCCGAGTTCTCTACCTGCCGAATCAACTGCACCAAGCTGTCTACCAAGATGTTCCTGCTGGACGACATCTACGACACCTACGGCACCGTCGAGGAGCTGAAGCCCTTCACCACCACCCTGACCCGATGGGACGTGTCTACCGTGGACAACCACCCCGACTACATGAAGATCGCCTTCAACTTCTCTTACGAGATCTACAAGGAGATCGCCTCTGAGGCCGAGCGAAAGCACGGCCCCTTCGTGTACAAGTACCTGCAGTCTTGCTGGAAGTCTTACATCGAGGCCTACATGCAGGAGGCCGAGTGGATCGCCTCTAACCACATCCCCGGCTTCGACGAGTACCTGATGAACGGCGTGAAGTCCTCTGGCATGCGAATCCTGATGATCCACGCCCTGATCCTGATGGACACCCCCCTGTCTGACGAGATTCTCGAGCAGCTGGACATCCCCTCGTCTAAGTCTCAGGCCCTGCTGTCTCTGATCACCCGACTGGTGGACGACGTGAAGGACTTCGAGGACGAGCAGGCCCACGGCGAGATGGCCTCTTCTATCGAGTGCTACATGAAGGACAACCACGGCTCTACCC
GAGAGGACGCCCTGAACTACCTGAAGATCCGAATCGAGTCTTGCGTGCAGGAGCTGAACAAGGAGCTGCTCGAGCCCTCTAACATGCACGGATCTTTCCGAAACCTGTACCTGAACGTGGGAATGCGAGTGATTTTCTTCATGCTGAACGACGGCGACCTGTTCACCCACTCTAACCGAAAGGAGATCCAGGACGCCATCACCAAGTTCTTCGTCGAGCCCATCATCCCCTgattcacacaggaaacagctatG (SEQ ID NO: 210)(RBS3stop)idi(RBS4stop) TgattcacacaggaaacagctatGCAAACGGAACACGTCATTTTATTGAATGCACAGGGAGTTCCCACGGGTACGCTGGAAAAGTATGCCGCACACACGGCAGACACCCGCTTACATCTCGCGTTCTCCAGTTGGCTGTTTAATGCCAAAGGACAATTATTAGTTACCCGCCGCGCACTGAGCAAAAAAGCATGGCCTGGCGTGTGGACTAACTCGGTTTGTGGGCACCCACAACTGGGAGAAAGCAACGAAGACGCAGTGATCCGCCGTTGCCGTTATGAGCTTGGCGTGGAAATTACGCCTCCTGAATCTATCTATCCTGACTTTCGCTACCGCGCCACCGATCCGAGTGGCATTGTGGAAAATGAAGTGTGTCCGGTATTTGCCGCACGCACCACTAGTGCGTTACAGATCAATGATGATGAAGTGATGGATTATCAATGGTGTGATTTAGCAGATGTATTACACGGTATTGATGCCACGCCGTGGGCGTTCAGTCCGTGGATGGTGATGCAGGCGACAAATCGCGAAGCCAGAAAACGATTATCTGCATTTACCCAGCTTAAATaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 211)(RBS4stop)idi(N31) TaaattaattgttcttttttcaggtgaaggttcccatGCAAACGGAACACGTCATTTTATTGAATGCACAGGGAGTTCCCACGGGTACGCTGGAAAAGTATGCCGCACACACGGCAGACACCCGCTTACATCTCGCGTTCTCCAGTTGGCTGTTTAATGCCAAAGGACAATTATTAGTTACCCGCCGCGCACTGAGCAAAAAAGCATGGCCTGGCGTGTGGACTAACTCGGTTTGTGGGCACCCACAACTGGGAGAAAGCAACGAAGACGCAGTGATCCGCCGTTGCCGTTATGAGCTTGGCGTGGAAATTACGCCTCCTGAATCTATCTATCCTGACTTTCGCTACCGCGCCACCGATCCGAGTGGCATTGTGGAAAATGAAGTGTGTCCGGTATTTGCCGCACGCACCACTAGTGCGTTACAGATCAATGATGATGAAGTGATGGATTATCAATGGTGTGATTTAGCAGATGTATTACACGGTATTGATGCCACGCCGTGGGCGTTCAGTCCGTGGATGGTGATGCAGGCGACAAATCGCGAAGCCAGAAAACGATTATCTGCATTTACCCAGCTTAAATgatgggctgaagggtttaaG (SEQ ID NO: 
212)(RBS1)ispA(RBS2stop)tagaaataattttgtttaactttaagaaggagatatacatatgGACTTTCCGCAGCAACTCGAAGCCTGCGTTAAGCAGGCCAACCAGGCGCTGAGCCGTTTTATCGCCCCACTGCCCTTTCAGAACACTCCCGTGGTCGAAACCATGCAGTATGGCGCATTATTAGGTGGTAAGCGCCTGCGACCTTTCCTGGTTTATGCCACCGGTCATATGTTCGGCGTTAGCACAAACACGCTGGACGCACCCGCTGCCGCCGTTGAGTGTATCCACGCTTACTCATTAATTCATGATGATTTACCGGCAATGGATGATGACGATCTGCGTCGCGGTTTGCCAACCTGCCATGTGAAGTTTGGCGAAGCAAACGCGATTCTCGCTGGCGACGCTTTACAAACGCTGGCGTTCTCGATTTTAAGCGATGCCGATATGCCGGAAGTGTCGGACCGCGACAGAATTTCGATGATTTCTGAACTGGCGAGCGCCAGTGGTATTGCCGGAATGTGCGGTGGTCAGGCATTAGATTTAGACGCGGAAGGCAAACACGTACCTCTGGACGCGCTTGAGCGTATTCATCGTCATAAAACCGGCGCATTGATTCGCGCCGCCGTTCGCCTTGGTGCATTAAGCGCCGGAGATAAAGGACGTCGTGCTCTGCCGGTACTCGACAAGTATGCAGAGAGCATCGGCCTTGCCTTCCAGGTTCAGGATGACATCCTGGATGTGGTGGGAGATACTGCAACGTTGGGAAAACGCCAGGGTGCCGACCAGCAACTTGGTAAAAGTACCTACCCTGCACTTCTGGGTCTTGAGCAAGCCCGGAAGAAAGCCCGGGATCTGATCGACGATGCCCGTCAGTCGCTGAAACAACTGGCTGAACAGTCACTCGATACCTCGGCACTGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAATaaccgttcatttatcacaaaaggattgttcgatG (SEQ ID NO: 213) (RBS2stop)ispA(RBS3stop)TaaccgttcatttatcacaaaaggattgttcgatGGACTTTCCGCAGCAACTCGAAGCCTGCGTTAAGCAGGCCAACCAGGCGCTGAGCCGTTTTATCGCCCCACTGCCCTTTCAGAACACTCCCGTGGTCGAAACCATGCAGTATGGCGCATTATTAGGTGGTAAGCGCCTGCGACCTTTCCTGGTTTATGCCACCGGTCATATGTTCGGCGTTAGCACAAACACGCTGGACGCACCCGCTGCCGCCGTTGAGTGTATCCACGCTTACTCATTAATTCATGATGATTTACCGGCAATGGATGATGACGATCTGCGTCGCGGTTTGCCAACCTGCCATGTGAAGTTTGGCGAAGCAAACGCGATTCTCGCTGGCGACGCTTTACAAACGCTGGCGTTCTCGATTTTAAGCGATGCCGATATGCCGGAAGTGTCGGACCGCGACAGAATTTCGATGATTTCTGAACTGGCGAGCGCCAGTGGTATTGCCGGAATGTGCGGTGGTCAGGCATTAGATTTAGACGCGGAAGGCAAACACGTACCTCTGGACGCGCTTGAGCGTATTCATCGTCATAAAACCGGCGCATTGATTCGCGCCGCCGTTCGCCTTGGTGCATTAAGCGCCGGAGATAAAGGACGTCGTGCTCTGCCGGTACTCGACAAGTATGCAGAGAGCATCGGCCTTGCCTTCCAGGTTCAGGATGACATCCTGGATGTGGTGGGAGATACTGCAACGTTGGGAAAACGCCAGGGTGCCGACCAGCAACTTGGTAAAAGTACCTACCCTGCACTTCTGGGTCTTGAGCAAGCCCGGAAGAAAGCCCGGGATCTGATCGACGATGCCCGTCAGTCGCTGAAACAACTGGCTGAACAGTCACTCGATACCTCGGCACTGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAATgattcacacaggaaac agctatG (SEQ 
ID NO: 214)(RBS3stop)ispA(RBS4stop) TgattcacacaggaaacagctatGGACTTTCCGCAGCAACTCGAAGCCTGCGTTAAGCAGGCCAACCAGGCGCTGAGCCGTTTTATCGCCCCACTGCCCTTTCAGAACACTCCCGTGGTCGAAACCATGCAGTATGGCGCATTATTAGGTGGTAAGCGCCTGCGACCTTTCCTGGTTTATGCCACCGGTCATATGTTCGGCGTTAGCACAAACACGCTGGACGCACCCGCTGCCGCCGTTGAGTGTATCCACGCTTACTCATTAATTCATGATGATTTACCGGCAATGGATGATGACGATCTGCGTCGCGGTTTGCCAACCTGCCATGTGAAGTTTGGCGAAGCAAACGCGATTCTCGCTGGCGACGCTTTACAAACGCTGGCGTTCTCGATTTTAAGCGATGCCGATATGCCGGAAGTGTCGGACCGCGACAGAATTTCGATGATTTCTGAACTGGCGAGCGCCAGTGGTATTGCCGGAATGTGCGGTGGTCAGGCATTAGATTTAGACGCGGAAGGCAAACACGTACCTCTGGACGCGCTTGAGCGTATTCATCGTCATAAAACCGGCGCATTGATTCGCGCCGCCGTTCGCCTTGGTGCATTAAGCGCCGGAGATAAAGGACGTCGTGCTCTGCCGGTACTCGACAAGTATGCAGAGAGCATCGGCCTTGCCTTCCAGGTTCAGGATGACATCCTGGATGTGGTGGGAGATACTGCAACGTTGGGAAAACGCCAGGGTGCCGACCAGCAACTTGGTAAAAGTACCTACCCTGCACTTCTGGGTCTTGAGCAAGCCCGGAAGAAAGCCCGGGATCTGATCGACGATGCCCGTCAGTCGCTGAAACAACTGGCTGAACAGTCACTCGATACCTCGGCACTGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAATaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 215) 
(RBS4stop)valC(N31)TaaattaattgttcttttttcaggtgaaggttcccatGGCCGAGATGTTCAACGGCAACTCTTCTAACGACGGATCTTCTTGCATGCCCGTGAAGGACGCCCTGCGACGAACCGGCAACCACCACCCCAACCTGTGGACCGACGACTTCATCCAGTCTCTGAACTCTCCCTACTCTGACTCTTCTTACCACAAGCACCGAGAGATCCTGATCGACGAGATCCGAGACATGTTCTCTAACGGCGAGGGCGACGAGTTCGGCGTGCTCGAGAACATCTGGTTCGTGGACGTGGTGCAGCGACTGGGCATCGACCGACACTTCCAGGAGGAGATCAAGACCGCCCTGGACTACATCTACAAGTTCTGGAACCACGACTCTATCTTCGGCGACCTGAACATGGTGGCCCTGGGCTTCCGAATCCTGCGACTGAACCGATACGTGGCCTCTTCTGACGTGTTCAAGAAGTTCAAGGGCGAGGAGGGCCAGTTCTCTGGCTTCGAGTCCTCTGACCAGGACGCTAAGCTCGAAATGATGCTGAACCTGTACAAGGCCTCTGAGCTGGACTTCCCCGACGAGGACATCCTGAAGGAGGCCCGAGCCTTCGCCTCTATGTACCTGAAGCACGTGATCAAGGAGTACGGCGACATCCAGGAGTCTAAGAACCCCCTGCTGATGGAGATCGAGTACACCTTCAAGTACCCCTGGCGATGCCGACTGCCCCGACTCGAGGCCTGGAACTTCATCCACATCATGCGACAGCAGGACTGCAACATCTCTCTGGCCAACAACCTCTACAAGATCCCCAAGATCTACATGAAGAAGATCCTCGAGCTGGCCATCCTGGACTTCAACATCCTGCAGTCTCAGCACCAGCACGAGATGAAGCTGATCTCTACCTGGTGGAAGAACTCTTCTGCTATCCAGCTGGACTTCTTCCGACACCGACACATCGAGTCTTACTTTTGGTGGGCCTCGCCCCTGTTCGAGCCCGAGTTCTCTACCTGCCGAATCAACTGCACCAAGCTGTCTACCAAGATGTTCCTGCTGGACGACATCTACGACACCTACGGCACCGTCGAGGAGCTGAAGCCCTTCACCACCACCCTGACCCGATGGGACGTGTCTACCGTGGACAACCACCCCGACTACATGAAGATCGCCTTCAACTTCTCTTACGAGATCTACAAGGAGATCGCCTCTGAGGCCGAGCGAAAGCACGGCCCCTTCGTGTACAAGTACCTGCAGTCTTGCTGGAAGTCTTACATCGAGGCCTACATGCAGGAGGCCGAGTGGATCGCCTCTAACCACATCCCCGGCTTCGACGAGTACCTGATGAACGGCGTGAAGTCCTCTGGCATGCGAATCCTGATGATCCACGCCCTGATCCTGATGGACACCCCCCTGTCTGACGAGATTCTCGAGCAGCTGGACATCCCCTCGTCTAAGTCTCAGGCCCTGCTGTCTCTGATCACCCGACTGGTGGACGACGTGAAGGACTTCGAGGACGAGCAGGCCCACGGCGAGATGGCCTCTTCTATCGAGTGCTACATGAAGGACAACCACGGCTCTACCCGAGAGGACGCCCTGAACTACCTGAAGATCCGAATCGAGTCTTGCGTGCAGGAGCTGAACAAGGAGCTGCTCGAGCCCTCTAACATGCACGGATCTTTCCGAAACCTGTACCTGAACGTGGGAATGCGAGTGATTTTCTTCATGCTGAACGACGGCGACCTGTTCACCCACTCTAACCGAAAGGAGATCCAGGACGCCATCACCAAGTTCTTCGTCGAGCCCATCATCCCCTgatgggctgaagggtttaaG (SEQ ID NO: 216) 
(RBS1)aroGTagaaataattttgtttaactttaagaaggagatatacatatgaattatcagaacgacmutant(RBS4stop)gatttacgcatcaaagaaatcaaagagttacttcctcctgtcgcattgctggaaaaattccccgctactgaaaatgccgcgaatacggttgcccatgcccgaaaagcgatccataagatcctgaaaggtaatgatgatcgcctgttggttgtgattggcccatgctcaattcatgatcctgtcgcggcaaaagagtatgccactcgcttgctggcgctgcgtgaagagctgaaagatgagctggaaatcgtaatgcgcgtctattttgaaaagccgcgtaccacggtgggctggaaagggctgattaacgatccgcatatggataatagcttccagatcaacgacggtctgcgtatagcccgtaaattgctgcttgatattaacgacagcggtctgccagcggcaggtgagtttctcaatatgatcaccccacaatatctcgctgacctgatgagctggggcgcaattggcgcacgtaccaccgaatcgcaggtgcaccgcgaactggcatcagggctttcttgtccggtcggcttcaaaaatggcaccgacggtacgattaaagtggctatcgatgccattaatgccgccggtgcgccgcactgcttcctgtccgtaacgaaatgggggcattcggcgattgtgaataccagcggtaacggcgattgccatatcattctgcgcggcggtaaagagcctaactacagcgcgaagcacgttgctgaagtgaaagaagggctgaacaaagcaggcctgccagcacaggtgatgatcgatttcagccatgctaactcgtccaaacaattcaaaaagcagatggatgtttgtgctgacgtttgccagcagattgccggtggcgaaaaggccattattggcgtgatggtggaaagccatctggtggaaggcaatcagagcctcgagagcggggagccgctggcctacggtaagagcatcaccgatgcctgcatcggctgggaagataccgatgctctgttacgtcaactggcgaatgcagtaaaagcgcgtcgcgggTaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 217)(RBS1)tyrA 
mutant(RBS4stop)tagaaataattttgtttaactttaagaaggagatatacatatggttgctgaattgaccgcattacgcgatcaaattgatgaagtcgataaagcgctgctgaatttattagcgaagcgtctggaactggttgctgaagtgggcgaggtgaaaagccgctttggactgcctatttatgttccggagcgcgaggcatctattttggcctcgcgtcgtgcagaggcggaagctctgggtgtaccgccagatctgattgaggatgttttgcgtcgggtgatgcgtgaatcttactccagtgaaaacgacaaaggatttaaaacactttgtccgtcactgcgtccggtggttatcgtcggcggtggcggtcagatgggacgcctgttcgagaagatgctgaccctctcgggttatcaggtgcggattctggagcaacatgactgggatcgagcggctgatattgttgccgatgccggaatggtgattgttagtgtgccaatccacgttactgagcaagttattggcaaattaccgcctttaccgaaagattgtattctggtcgatctggcatcagtgaaaaatgggccattacaggccatgctggtggcgcatgatggtccggtgctggggctacacccgatgttcggtccggacagcggtagcctggcaaagcaagttgtggtctggtgtgatggacgtaaaccggaagcataccaatggtttctggagcaaattcaggtctggggcgctcggctgcatcgtattagcgccgtcgagcacgatcagaatatggcgtttattcaggcactgcgccactttgctacttttgcttacgggctgcacctggcagaagaaaatgttcagcttgagcaacttctggcgctctcttcgccgatttaccgccttgagctggcgatggtcgggcgactgtttgctcaggatccgcagctttatgccgacatcattatgtcgtcagagcgtaat ctggcgttaatcaaacgttactataagcgtttcggcgaggcgattgagttgctggagcagggcgataagcaggcgtttattgacagtttccgcaaggtggagcactggttcggcgattacgtacagcgttttcagagtgaaagccgcgtgttattgcgtcaggcgaatgacaatcgccagTaaattaattgttcttttttcaggtgaaggttcccatG (SEQ ID NO: 218)(RBS4stop)tyrA 
mutant(N31)TaaattaattgttcttttttcaggtgaaggttcccatGgttgctgaattgaccgcattacgcgatcaaattgatgaagtcgataaagcgctgctgaatttattagcgaagcgtctggaactggttgctgaagtgggcgaggtgaaaagccgctttggactgcctatttatgttccggagcgcgaggcatctattttggcctcgcgtcgtgcagaggcggaagctctgggtgtaccgccagatctgattgaggatgttttgcgtcgggtgatgcgtgaatcttactccagtgaaaacgacaaaggatttaaaacactttgtccgtcactgcgtccggtggttatcgtcggcggtggcggtcagatgggacgcctgttcgagaagatgctgaccctctcgggttatcaggtgcggattctggagcaacatgactgggatcgagcggctgatattgttgccgatgccggaatggtgattgttagtgtgccaatccacgttactgagcaagttattggcaaattaccgcctttaccgaaagattgtattctggtcgatctggcatcagtgaaaaatgggccattacaggccatgctggtggcgcatgatggtccggtgctggggctacacccgatgttcggtccggacagcggtagcctggcaaagcaagttgtggtctggtgtgatggacgtaaaccggaagcataccaatggtttctggagcaaattcaggtctggggcgctcggctgcatcgtattagcgccgtcgagcacgatcagaatatggcgtttattcaggcactgcgccactttgctacttttgcttacgggctgcacctggcagaagaaaatgttcagcttgagcaacttctggcgctctcttcgccgatttaccgccttgagctggcgatggtcgggcgactgtttgctcaggatccgcagctttatgccgacatcattatgtcgtcagagcgtaatctggcgttaatcaaacgttactataagcgtttcggcgaggcgattgagttgctggagcagggcgataagcaggcgtttattgacagtttccgcaaggtggagcactggttcggcgattacgtacagcgttttcagagtgaaagccgcgtgttattgcgtcaggcgaatgacaatcgccagTgatgggctgaagggtttaaG (SEQ ID NO: 219) (RBS4stop)aroG 
mutant(N31)TaaattaattgttcttttttcaggtgaaggttcccatGaattatcagaacgacgatttacgcatcaaagaaatcaaagagttacttcctcctgtcgcattgctggaaaaattccccgctactgaaaatgccgcgaatacggttgcccatgcccgaaaagcgatccataagatcctgaaaggtaatgatgatcgcctgttggttgtgattggcccatgctcaattcatgatcctgtcgcggcaaaagagtatgccactcgcttgctggcgctgcgtgaagagctgaaagatgagctggaaatcgtaatgcgcgtctattttgaaaagccgcgtaccacggtgggctggaaagggctgattaacgatccgcatatggataatagcttccagatcaacgacggtctgcgtatagcccgtaaattgctgcttgatattaacgacagcggtctgccagcggcaggtgagtttctcaatatgatcaccccacaatatctcgctgacctgatgagctggggcgcaattggcgcacgtaccaccgaatcgcaggtgcaccgcgaactggcatcagggctttcttgtccggtcggcttcaaaaatggcaccgacggtacgattaaagtggctatcgatgccattaatgccgccggtgcgccgcactgcttcctgtccgtaacgaaatgggggcattcggcgattgtgaataccagcggtaacggcgattgccatatcattctgcgcggcggtaaagagcctaactacagcgcgaagcacgttgctgaagtgaaagaagggctgaacaaagcaggcctgccagcacaggtgatgatcgatttcagccatgctaactcgtccaaacaattcaaaaagcagatggatgtttgtgctgacgtttgccagcagattgccggtggcgaaaaggccattattggcgtgatggtggaaagccatctggtggaaggcaatcagagcctcgagagcggggagccgctggcctacggtaagagcatcaccgatgcctgcatcggctgggaagataccgatgctctgttacgtcaactggcgaatgcagtaaaagcgcgtcgcgggTgatgggctgaagggtttaaG (SEQ ID NO: 220)

Assembly of Barcoded UDS Fragment

The previously reported cross-lapping in vitro assembly (CLIVA) suffered from low DNA assembly efficiency (the success rate of assembling 7 fragments is less than 10%)²⁴, and we have demonstrated that it could be substantially improved by using a thermophilic Taq ligase with a longer in vitro incubation time (data not shown; termed enhanced CLIVA). Mix equimolar amounts of the barcoded UDS fragments obtained from PCR with universal oligos, and add 0.5 microliters of Taq DNA ligase (NEB, M0208) and 0.5 microliters of 10× ligation buffer. Top up the reaction volume to 5 microliters with nuclease-free water. Incubate the solution at 45° C. overnight in a PCR tube heated in a PCR machine. The incubation time can be shortened to 1 to 6 hours for assembly of 2 to 5 fragments, but this may reduce the efficiency of the DNA assembly, since overnight incubation has been demonstrated to be more efficient (data not shown), especially when the construct assembled from 5-7 barcoded UDS fragments is larger than 10 kb.
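The equimolar mixing step can be illustrated with a short calculation. The following Python sketch is not part of the described protocol. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA, and it assumes that the 4 microliters left after ligase and buffer are filled entirely with fragment DNA. The function names and the example concentrations are hypothetical.

```python
# Assumption: average molar mass of one base pair of dsDNA (g/mol).
BP_MASS = 650.0


def fmol_per_ul(ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to a molar one (fmol/uL)."""
    return ng_per_ul * 1e6 / (length_bp * BP_MASS)


def equimolar_volumes(fragments: dict, dna_volume_ul: float = 4.0) -> dict:
    """Split dna_volume_ul among fragments so each contributes the same fmol.

    fragments maps a fragment name to (ng_per_ul, length_bp).
    The default of 4.0 uL is the 5 uL reaction minus 0.5 uL ligase
    and 0.5 uL 10x buffer (assumption: no water is needed).
    """
    conc = {name: fmol_per_ul(c, n) for name, (c, n) in fragments.items()}
    # Volume is inversely proportional to molar concentration.
    inverse = {name: 1.0 / c for name, c in conc.items()}
    scale = dna_volume_ul / sum(inverse.values())
    return {name: v * scale for name, v in inverse.items()}


if __name__ == "__main__":
    # Hypothetical fragments: name -> (ng/uL, length in bp).
    example = {"frag_a": (65.0, 1000), "frag_b": (130.0, 1000)}
    for name, vol in equimolar_volumes(example).items():
        print(f"{name}: {vol:.2f} uL")
```

If the fragment stocks are concentrated enough that less than 4 microliters is needed, the remainder would be made up with nuclease-free water as described above.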

Mix 1 to 2 microliters of ligation products with 17 microliters of DH5α heat-shock competent cells (NEB, C2987H), chill on ice for 5 min and incubate the mixture at 42° C. for 35 seconds (in a 1.7 milliliter Eppendorf tube immersed in a water bath), then place on ice for 2 minutes. Add 150 microliters of SOC medium and plate all of the cells on an agar plate with the appropriate selection antibiotics. Colony PCR (10 microliter volume) was performed to evaluate the efficiency of plasmid assembly, and all oligos used are listed in Table 11. Sequencing-confirmed plasmids with accurate barcode regions were transformed into the corresponding E. coli MG1655 strains, and all constructed strains are listed in Tables 12 and 13.

TABLE 11 Plasmids and colony PCR

Name | Plasmid constructed | Colony PCR forward oligos | Colony PCR reverse oligos | Colony number raised on the plate | Correct colony number out of tested
GE1 | (N21)aadA(N22)pMB1(pJ23119)gRNAaslA(N23)aslAHF0.5(N24)aslAHT0.5 | pJ23119-Bff | N21-Brr | Over 100 | 3 out of 3
GE2 | (N21)aadA(N22)pMB1(pJ23119)gRNAaslA(N23)aslAHF1.0(N24)aslAHT1.0 | pJ23119-Bff | N21-Brr | Over 100 | 3 out of 3
GE3 | (N21)aadA(N22)pMB1(pJ23119)gRNAnupG(N23)nupGHF0.5(N24)nupGHT0.5 | pJ23119-Bff | N21-Brr | Over 100 | 8 out of 3
GE4 | (N21)bla(N22)pMB1(pJ23119)gRNAtyrR(N23)tyrRHF0.5(N24)tyrRHT0.5 | pJ23119-Bff | N21-Brr | Over 100 | 4 out of 4
GE5 | (N21)aadA(N22)pMB1(pJ23119)gRNApheA(N23)pheAHF0.5(N24)pheA-HT0.5 | pJ23119-Bff | N21-Brr | Over 100 | 3 out of 3
GE6 | (N21)aadA(N22)pMB1(pJ23119)gRNAnupG(N23)nupGHF1.0(N24)nupGHT1.0 | pJ23119-Bff | N21-Brr | 10 to 50 | 2 out of 3
GE7 | (N21)aadA(N22)pMB1(pJ23119)gRNAnupG(N23)nupGHF1.5(N24)nupGHT1.5 | pJ23119-Bff | N21-Brr | 10 to 50 | 3 out of 3
GE8 | (N21)aadA(N22)pMB1(pJ23119)gRNAmelB(N23)nupGHF1.0(N24)nupG-HT1.0 | pJ23119-Bff | N21-Brr | 10 to 50 | 3 out of 3
GE9 | (N21)aadA(N22)pMB1(pJ23119)gRNArcsB(N23)nupGHF1.0(N24)nupGHT1.0 | pJ23119-Bff | N21-Brr | 2 | 1 out of 2
GE10 | (N21)aadA(N22)pMB1(pJ23119)gRNAptsI(N23)ptsIHF0.7(N24)ptsIHT0.77 | pJ23119-Bff | N21-Brr | Over 100 | 1 out of 1
IP1 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)idi(RBS3stop)ispA(RBS4stop)valC(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 5 out of 6
IP2 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)idi(RBS3stop)valC(RBS4stop)ispA(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP3 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)valC(RBS3stop)idi(RBS4stop)ispA(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP4 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)valC(RBS3stop)ispA(RBS4stop)idi(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP5 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)ispA(RBS3stop)valC(RBS4stop)idi(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP6 | (N21)aadA(N22)p5(N23)laclT7(RBS1)dxs(RBS2stop)ispA(RBS3stop)idi(RBS4stop)valC(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP7 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)dxs(RBS3stop)valC(RBS4stop)ispA(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP8 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)dxs(RBS3stop)ispA(RBS4stop)valC(N31)t7t | G-idi F | N31-Brr | Over 100 | 5 out of 6
IP9 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)valC(RBS3stop)dxs(RBS4stop)ispA(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP10 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)valC(RBS3stop)ispA(RBS4stop)dxs(N31)t7t | G-idi F | N31-Brr | Over 100 | 5 out of 6
IP11 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)ispA(RBS3stop)dxs(RBS4stop)valC(N31)t7t | G-idi F | ValC-R_tgtctcggatctcgtcgatc | Over 100 | 5 out of 6
IP12 | (N21)aadA(N22)p5(N23)laclT7(RBS1)idi(RBS2stop)ispA(RBS3stop)valC(RBS4stop)dxs(N31)t7t | G-idi F | N31-Brr | Over 100 | 5 out of 6
IP13 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)idi(RBS3stop)dxs(RBS4stop)ispA(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 6 out of 6
IP14 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)idi(RBS3stop)ispA(RBS4stop)dxs(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 4 out of 6
IP15 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)dxs(RBS3stop)idi(RBS4stop)ispA(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 6 out of 6
IP16 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)dxs(RBS3stop)ispA(RBS4stop)idi(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP17 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)ispA(RBS3stop)dxs(RBS4stop)idi(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 5 out of 6
IP18 | (N21)aadA(N22)p5(N23)laclT7(RBS1)valC(RBS2stop)ispA(RBS3stop)idi(RBS4stop)dxs(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 3 out of 6
IP19 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)idi(RB3stop)dxs(RBS4stop)valC(N31)t7t | ispA-F_atgacatcctggatgtggtg | N31-Brr | Over 100 | 5 out of 6
IP20 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)idi(RBS3stop)valC(RBS4stop)dxs(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP21 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)dxs(RBS3stop)valC(RBS4stop)idi(N31)t7t | ACAY_ggatctcgacgctctccctt | N31-Brr | Over 100 | 6 out of 6
IP22 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)dxs(RBS3stop)idi(RBS4stop)valC(N31)t7t | ispA-F_atgacatcctggatgtggtg | N31-Brr | Over 100 | 6 out of 6
IP23 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)valC(RBS3stop)dxs(RBS4stop)idi(N31)t7t | ispA-F_atgacatcctggatgtggtg | N31-Brr | Over 100 | 4 out of 6
IP24 | (N21)aadA(N22)p5(N23)laclT7(RBS1)ispA(RBS2stop)valC(RBS3stop)idi(RBS4stop)dxs(N31)t7t | ValC-F_gaactacctgaagatccgaa | N31-Brr | Over 100 | 5 out of 6
TP1 | (N21)aadA(N22)p5(N23)laclT7(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 5 out of 6
TP2 | (N21)aadA(N22)p15(N23)laclT7(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 5 out of 6
TP3 | (N21)aadA(N22)pMB1(N23)laclT7(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 6 out of 6
TP4 | (N21)aadA(N22)pMB1 mutant(N23)laclT7(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 5 out of 6
TP5 | (N21)aadA(N22)p5(N23)pLac(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 20 to 30 | 3 out of 6
TP6 | (N21)aadA(N22)p15(N23)pLac(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 6 out of 6
TP7 | (N21)aadA(N22)pMB1(N23)pLac(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 6 out of 6
TP8 | (N21)aadA(N22)pMB1 mutant(N23)pLac(RBSB1)aroG mutant(RBS4stop)tyrA mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 6 out of 6
TP9 | (N21)aadA(N22)p5(N23)laclT7(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 6 out of 6
TP10 | (N21)aadA(N22)p15(N23)laclT7(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 4 out of 6
TP11 | (N21)aadA(N22)pMB1(N23)laclT7(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 6 out of 6
TP12 | (N21)aadA(N22)pMB1 mutant(N23)laclT7(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(N21) | RBS1-Bff | N31-Brr | Over 100 | 6 out of 6
TP13 | (N21)aadA(N22)p5(N23)pLac(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 10 to 20 | 6 out of 6
TP14 | (N21)aadA(N22)p15(N23)pLac(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 4 out of 6
TP15 | (N21)aadA(N22)pMB1(N23)pLac(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 6 out of 6
TP16 | (N21)aadA(N22)pMB1 mutant(N23)pLac(RBSB1)tyrA mutant(RBS4stop)aroG mutant(N31)t7t(LK3)Lacl(N21) | G-pLac F | t7t-T R | 50 to 100 | 6 out of 6

TABLE 12 Transformed strains and introduced plasmids

Name | Strain genotype | Plasmid
IPS1 | MG1655_ΔrecA_ΔendA_DE3 | IP1
IPS2 | MG1655_ΔrecA_ΔendA_DE3 | IP2
IPS3 | MG1655_ΔrecA_ΔendA_DE3 | IP3
IPS4 | MG1655_ΔrecA_ΔendA_DE3 | IP4
IPS5 | MG1655_ΔrecA_ΔendA_DE3 | IP5
IPS6 | MG1655_ΔrecA_ΔendA_DE3 | IP6
IPS7 | MG1655_ΔrecA_ΔendA_DE3 | IP7
IPS8 | MG1655_ΔrecA_ΔendA_DE3 | IP8
IPS9 | MG1655_ΔrecA_ΔendA_DE3 | IP9
IPS10 | MG1655_ΔrecA_ΔendA_DE3 | IP10
IPS11 | MG1655_ΔrecA_ΔendA_DE3 | IP11
IPS12 | MG1655_ΔrecA_ΔendA_DE3 | IP12
IPS13 | MG1655_ΔrecA_ΔendA_DE3 | IP13
IPS14 | MG1655_ΔrecA_ΔendA_DE3 | IP14
IPS15 | MG1655_ΔrecA_ΔendA_DE3 | IP15
IPS16 | MG1655_ΔrecA_ΔendA_DE3 | IP16
IPS17 | MG1655_ΔrecA_ΔendA_DE3 | IP17
IPS18 | MG1655_ΔrecA_ΔendA_DE3 | IP18
IPS19 | MG1655_ΔrecA_ΔendA_DE3 | IP19
IPS20 | MG1655_ΔrecA_ΔendA_DE3 | IP20
IPS21 | MG1655_ΔrecA_ΔendA_DE3 | IP21
IPS22 | MG1655_ΔrecA_ΔendA_DE3 | IP22
IPS23 | MG1655_ΔrecA_ΔendA_DE3 | IP23
IPS24 | MG1655_ΔrecA_ΔendA_DE3 | IP24
TPS1 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP1
TPS2 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP2
TPS3 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP3
TPS4 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP4
TPS5 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP5
TPS6 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP6
TPS7 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP7
TPS8 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP8
TPS9 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP9
TPS10 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP10
TPS11 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP11
TPS12 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP12
TPS13 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP13
TPS14 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP14
TPS15 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP15
TPS16 | MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | TP16

TABLE 13 Transformed strains and introduced plasmids with PCR primers used for colony PCR

E. coli strains with expected genotype modifications | Plasmids used | Colony PCR forward and reverse oligos | Sequence
MG1655_ΔrecA_ΔendA_ΔaslA_DE3 | GE1 | aslA screening 4F/4R | TGGAACAACAGGCATGGATT (SEQ ID NO: 221)/ACAGGCGAAATATGGTGCT (SEQ ID NO: 222)
MG1655_ΔrecA_ΔendA_ΔaslA_DE3 | GE2 | aslA screening 4F/4R | TGGAACAACAGGCATGGATT (SEQ ID NO: 221)/ACAGGCGAAATATGGTGCT (SEQ ID NO: 222)
MG1655_ΔrecA_ΔendA_ΔnupG_DE3 | GE3 | nupG screening 2F/2R | GGAAATATGGCGTTGATGAG (SEQ ID NO: 223)/AGGATTATCCGACATCAGTG (SEQ ID NO: 224)
MG1655_ΔrecA_ΔendA_ΔpheA_DE3 | GE4 | pheA screening F/R | TCATCAAATATGGCTCGCTT (SEQ ID NO: 225)/TCGAGCGGCTGATATTGTTG (SEQ ID NO: 226)
MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 | GE5 | tyrR screening F/R | AACGCTGGTATGCCTCAATC (SEQ ID NO: 227)/AGGCTTCCTCGAATACCTTA (SEQ ID NO: 228)
MG1655_ΔrecA_ΔendA_ΔnupG_DE3 | GE6 | nupG screening 4F/4R | TATTGTGCCTATGTGGCTTC (SEQ ID NO: 229)/CGAATAAAGTGGTGACGAATG (SEQ ID NO: 230)
MG1655_ΔrecA_ΔendA_ΔnupG_DE3 | GE7 | nupG screening 4F/4R | TATTGTGCCTATGTGGCTTC (SEQ ID NO: 229)/CGAATAAAGTGGTGACGAATG (SEQ ID NO: 230)
MG1655_ΔrecA_ΔendA_ΔmelB_DE3 | GE8 | melB screening 2F/2R | GTAAGCGGCATGGTCTGGAAC (SEQ ID NO: 231)/GCAGGCCGTATGGACTCCTA (SEQ ID NO: 232)
MG1655_ΔrecA_ΔendA_ΔrcsB_DE3 | GE9 | rcsB screening 3F/3R | AACTGGCGAATCAGGCAGA (SEQ ID NO: 233)/GCGATTATCTCTCTATCCGT (SEQ ID NO: 234)
MG1655_ΔrecA_ΔendA_ΔptsH_ptsl_crr_DE3 | GE10 | ptsH/I_crr screening F/R | AGACCGATCTTATCTCTGTC (SEQ ID NO: 235)/TAGTGTAATGACCAGACAAA (SEQ ID NO: 236)

Cell Culture, GCMS and HPLC Measurement

Valencene-producing E. coli will be screened in test tubes. A single colony will be inoculated into LB medium and cultured overnight at 37° C. and 250 rpm. The overnight-grown cell suspension was diluted 100-fold with K3 medium²⁵ and cultured at 30° C. and 250 rpm until the cell density (OD600) reached 0.5-1.0, at which point inducer (IPTG, 100 mM) was added to a final concentration of 0, 0.005 or 0.1 mM. One milliliter of the induced cells will be transferred to a 14 mL round-bottom Falcon tube, and 200 microliters of dodecane was added. The tube will be incubated at 30° C. and 250 rpm for 72 h. Spectinomycin was added to both the seed medium (LB) and the culture medium (K3 medium) at 50 micrograms per milliliter. At the end of incubation, 8 microliters of the dodecane phase will be drawn and diluted with 800 microliters of ethyl acetate. One microliter of the mixture will be injected into the GCMS for analysis of valencene. GCMS measurement conditions: Agilent HP-5 ms column. The oven program for valencene is 100° C. for 1 min, ramping up to 190° C. at 110° C. per minute, ramping up to 220° C. at 5° C. per minute, ramping up to 280° C. at 60° C. per minute, and holding for 2 minutes. Helium is used as the carrier gas at 1 milliliter per minute. The mass spectrometer is operated in scan mode (40-400 m/z). Valencene was quantified by calibration with standards.
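As a cross-check of the GC oven program given above, the total run time per injection can be computed from the hold times and ramp rates. The following Python sketch is illustrative only. The function name is hypothetical, and the ramp rates are taken exactly as stated in the text.

```python
def oven_program_minutes(initial_temp, initial_hold, ramps, final_hold):
    """Total duration (min) of a GC oven temperature program.

    ramps is a list of (target_temp_C, rate_C_per_min) applied in order.
    """
    total = initial_hold
    temp = initial_temp
    for target, rate in ramps:
        total += (target - temp) / rate  # time spent on this ramp
        temp = target
    return total + final_hold


# Valencene program as stated: 100 C for 1 min, to 190 C at 110 C/min,
# to 220 C at 5 C/min, to 280 C at 60 C/min, then hold 2 min.
valencene_run = oven_program_minutes(
    initial_temp=100,
    initial_hold=1,
    ramps=[(190, 110), (220, 5), (280, 60)],
    final_hold=2,
)

if __name__ == "__main__":
    print(f"Run time per injection: {valencene_run:.1f} min")
```

With the stated values this gives roughly 10.8 minutes per injection, excluding oven cool-down between runs.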

Tyrosine-producing E. coli will be screened in test tubes. A single colony will be inoculated into LB medium and cultured overnight at 37° C. and 250 rpm. The overnight-grown cell suspension was diluted 100-fold with K3 medium and cultured at 30° C. and 250 rpm until the cell density (OD600) reached 0.5-1.0, at which point inducer (IPTG, 100 mM) was added to a final concentration of 0.1 mM. One milliliter of the induced cells will be transferred to a 14 mL round-bottom Falcon tube. The tube will be incubated at 30° C. and 250 rpm for 84 h. Spectinomycin was added to both the seed medium (LB) and the culture medium (K3 medium) at 50 micrograms per milliliter. At the end of incubation, 100 microliters of 6 M HCl will be added to 1 milliliter of cell culture broth to dissolve precipitated tyrosine at 37° C. and 250 rpm for 30 minutes; then 150 microliters of the resulting cell suspension was diluted with 450 microliters of 0.1 M HCl and centrifuged at 12,000 rpm for 5 minutes. Two microliters of the supernatant will be injected into the HPLC for analysis of tyrosine. HPLC measurement conditions: Agilent C18 column. The mobile phase is 10% (v/v) acetonitrile and 90% (v/v) of 0.1% (v/v) trifluoroacetic acid, the flow rate is 0.4 milliliter per minute, the column temperature is set at 30° C., and the run time is 15 minutes. The detector is set to UV absorbance at 254 nm.
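Because the broth is acidified and then diluted before injection, a concentration measured by HPLC has to be multiplied by the overall dilution factor to recover the titer in the original broth. The following Python sketch shows that arithmetic. It assumes the volumes are simply additive, and the function names and the example reading are hypothetical.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when sample_ul is mixed with diluent_ul (additive volumes assumed)."""
    return (sample_ul + diluent_ul) / sample_ul


# Step 1: 100 uL of 6 M HCl added to 1 mL (1000 uL) of broth.
ACID_STEP = dilution_factor(1000, 100)
# Step 2: 150 uL of the acidified suspension diluted with 450 uL of 0.1 M HCl.
HCL_STEP = dilution_factor(150, 450)
TOTAL_DILUTION = ACID_STEP * HCL_STEP


def broth_titer(measured_conc: float) -> float:
    """Convert a concentration measured in the injected sample to the broth titer."""
    return measured_conc * TOTAL_DILUTION


if __name__ == "__main__":
    # Hypothetical HPLC reading of 0.25 g/L in the injected supernatant.
    print(f"Total dilution: {TOTAL_DILUTION:.2f}-fold")
    print(f"Broth titer: {broth_titer(0.25):.2f} g/L")
```

Under these assumptions the two steps combine to a 4.4-fold dilution.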

Genome Editing of E. coli

Based on a two-plasmid CRISPR system for E. coli¹³, the reported procedure was further simplified. An isolated colony from a plate prepared from a glycerol stock of E. coli MG1655_DE3 cells carrying the plasmid pCAS was inoculated into 5 milliliters of LB and grown at 30° C. and 200 rpm overnight, with 50 micrograms per milliliter kanamycin added to maintain pCAS. Transfer 100 microliters of the overnight-grown cell suspension to 10 milliliters of LB with 150 mM L-arabinose and 50 micrograms per milliliter kanamycin, grow at 30° C. and 200 rpm to an OD600 of 0.55 (approximately 2.5-3 h), then spin at 6,000 rpm for 10 minutes at room temperature. Resuspend the cell pellet in 1 milliliter of ice-cold ultrapure water, transfer to a chilled 1.5 milliliter Eppendorf tube and incubate on ice for 5 minutes. Spin at 10,000 g for 15 seconds at room temperature, then resuspend and wash the cells twice with 1 milliliter of ice-cold ultrapure water. Resuspend the cell pellet in a final volume of 100 microliters to give electrocompetent cells. Mix 100 to 200 nanograms of plasmids with 50 microliters of electrocompetent cells, and chill on ice for 5 minutes. Electroporation was performed at 1.8 kV. Immediately add 600 microliters of SOC medium after electroporation, and transfer the cells to a 1.7 milliliter Eppendorf tube. Incubate the cell suspension at 30° C. for 2 hours. Spread all recovered cells onto an LB plate containing two antibiotics (100 micrograms per milliliter ampicillin plus 50 micrograms per milliliter kanamycin, or 50 micrograms per milliliter spectinomycin plus 50 micrograms per milliliter kanamycin), then incubate at 30° C. for 48 hours. Colony PCR was performed to evaluate the efficiency of gene deletion and insertion, and all strains and oligos used are listed in Tables 12 and 13.

Example 4: Discussion

A critical step in adding barcodes to the sides of a fragment is to ensure that any standardized barcode can be added to either side of any standardized fragment. Theoretically, blunt end ligation could be used to add two barcodes to the sides of any fragment, but it may be difficult to control which side of the fragment each barcode is added to. As a result, in an existing DNA assembly standard (BASIC⁸), barcodes were added to fragments by sticky end ligation to gain specificity. However, sticky end ligation requires both barcodes and fragments to have conserved sticky end sequences on their sides, to ensure compatibility between any barcode-fragment pair. Such conserved sequences remain in the final construct and become scars, which may be 4-6 nt long. A scar results in extra amino acids in the protein if it lies in a protein-coding sequence; it also makes the function of nearby BPs less predictable when it lies in a non-coding sequence⁹.

With sticky end ligation, traditional DNA assembly depends on restriction enzyme digestion to generate sticky ends (SE) (BioBricks assembly¹⁴, Golden gate¹⁵ and BASIC⁷). Their length is usually 4 nt, resulting in a scar of at least 4 bp between BPs. These scars introduce extra amino acids into proteins and extra nucleotides into transcription regulatory regions¹⁶. For example, the BASIC method⁷ leaves the scar ‘GTCC’ in front of, and the scar ‘GGCTCG’ behind, a protein-coding sequence (FIG. 9), i.e. an extra serine is added to the N terminus (‘TCC’ encodes serine), and an extra glycine and serine are added to the C terminus (‘GGC’ and ‘TCG’ encode glycine and serine, respectively). Recently, an automatic design algorithm for DNA assembly, called J5, was developed to reduce scars in combinatorial DNA assembly⁶ by systematically combining several DNA assembly methods, but it still left scars in the final construct and also needed to avoid forbidden sequences, because it used type II restriction enzymes. Here, the UDS BPs and the innovative barcoding method avoid scars in most cases.
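The amino acids contributed by the two BASIC scars can be checked by translating the in-frame codons with the standard genetic code. The sketch below is illustrative only; the helper name `translate` is an assumption, and the reading frame for ‘GTCC’ (codon ‘TCC’ starting at the second nucleotide) follows the description above.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons of a DNA string into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Scar in front of the coding sequence: the in-frame codon is 'TCC' (serine).
front_scar_residues = translate("GTCC"[1:])
# Scar behind the coding sequence: 'GGC' + 'TCG' (glycine, serine).
back_scar_residues = translate("GGCTCG")
```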

An important feature of UDS is that a new fragment formed by assembling existing fragments and barcodes (termed a composite fragment) can be readily used in a new round of plasmid construction (FIG. 1). In fact, composite fragments can be treated as regular fragments, providing maximal flexibility in reusing them. In contrast, existing multi-tier standards always have a strict hierarchy: there are a few tiers of parts, and parts in one tier can only be assembled with parts in the same tier¹⁷.

We foresee the method developed here becoming a powerful tool in biotechnology. With a small library of UDS BPs, we have demonstrated three applications and the possibility of creating versatile plasmids from standardized BPs. The impact of UDS will grow as the UDS library expands and the population of researchers using it increases. UDS fragments, once barcoded, can also be assembled by any assembly method, including but not limited to SLIC¹⁸, Gibson¹⁹, USER²⁰, MODAL and DNA assembler²². This feature lowers the barrier for researchers to adopt UDS BPs, because they can continue to use the DNA assembly method they are familiar with.

Example 5: The Three Rules and Workflow of GTas

In this example, GTas has three rules: (1) any DNA sequence longer than 35 nucleotides (nt), starting with ‘G’ and ending with ‘T’, can be defined as a fragment; (2) any DNA sequence longer than 20 nt and shorter than 80 nt can be defined as a barcode; (3) in plasmid construction any fragment must be placed after a barcode, and any barcode must be placed after a fragment. Because the first two rules are very easy to satisfy, most functional DNA sequences can be defined as a fragment and/or a barcode (examples are provided in FIG. 10a). To construct a plasmid, each end of a fragment is first connected to half of a barcode, which has a complementary half that is connected to another fragment. The barcoded fragments are then assembled into a plasmid in a specific order based on pairing of barcode halves (FIG. 10a). This workflow always satisfies the third rule.
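The three rules can be expressed as simple checks. The following is a minimal sketch, not part of the described workflow; the function names are hypothetical, and the third rule is interpreted here as strict alternation of fragments and barcodes around a circular plasmid.

```python
def is_fragment(seq: str) -> bool:
    """Rule 1: longer than 35 nt, starting with 'G' and ending with 'T'."""
    s = seq.upper()
    return len(s) > 35 and s.startswith("G") and s.endswith("T")


def is_barcode(seq: str) -> bool:
    """Rule 2: longer than 20 nt and shorter than 80 nt."""
    return 20 < len(seq) < 80


def follows_rule_three(kinds: list[str]) -> bool:
    """Rule 3 (assumed circular): each part is followed by a part of the other kind.

    `kinds` lists "fragment" or "barcode" in plasmid order; the last part is
    treated as adjacent to the first because a plasmid is circular.
    """
    n = len(kinds)
    return n >= 2 and all(kinds[i] != kinds[(i + 1) % n] for i in range(n))
```

A layout such as `["fragment", "barcode", "fragment", "barcode"]` passes the third check, whereas two adjacent fragments do not.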

A basic requirement of any standard system is compatibility. In this context, it means any barcode can be placed after any fragment, which enables arrangement of fragments and barcodes in any order as long as the third rule is met. To have such compatibility, one has to conserve nucleotides at fragment ends to ensure their standard connections to any barcode. A key innovation of this work is that we managed to minimize the length of the conserved sequence at each end to one nt, which is the minimal length of any conserved sequence.

Having a longer conserved sequence constrains flexible use of fragments. For example, if a fragment is a protein-coding sequence and the conserved sequence at its 3′ end is ‘TAG’ (a stop codon), it is impossible to fuse a fluorescent protein to its C-terminus to study its cellular localization, because translation terminates at the stop codon; in GTas, the conserved sequence at the 3′ end is ‘T’, which can be used to create 16 codons (including stop and non-stop codons) by connecting different barcodes to this end (FIG. 10b). This example also explains why we selected ‘T’ as the conserved sequence at the 3′ end of a fragment.

Similarly, we chose ‘G’ as the conserved sequence at the 5′ end of a fragment, because it can be used to encode ‘ATG’ (a start codon) or another 15 codons that end with ‘G’ (FIG. 10b). An experimental validation of this flexibility is provided in FIG. 16. Because of this flexibility, GTas almost eliminates scars, which often interfere with the function of DNA parts by adding extra, undesired codons to a coding sequence and/or incurring undesired DNA-protein interactions³¹.
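The count of 16 codons at each end follows from fixing one position of a codon and letting the adjacent barcode supply the other two. A short sketch of the enumeration is given below; the function names are illustrative assumptions.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_starting_with(base: str) -> list[str]:
    """Codons whose first nucleotide is fixed (e.g. the conserved 3' 'T')."""
    return [base + "".join(rest) for rest in product(NUCLEOTIDES, repeat=2)]


def codons_ending_with(base: str) -> list[str]:
    """Codons whose last nucleotide is fixed (e.g. the conserved 5' 'G')."""
    return ["".join(rest) + base for rest in product(NUCLEOTIDES, repeat=2)]


t_first = codons_starting_with("T")  # 16 codons, stop and non-stop
g_last = codons_ending_with("G")     # 16 codons, including the start codon ATG
```

Under the standard genetic code, all three stop codons begin with ‘T’, so the 3′ end can be closed with a stop codon or read through, depending on the barcode chosen.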

To connect halves of two barcodes to designated ends of a fragment (a process defined as barcoding), one needs to use DNA ligation techniques based on sticky DNA ends, which on fragments are derived from the conserved DNA sequences, and which on barcodes are added as standard connectors (FIG. 10c). Since the conserved sequence length is one nt in GTas, the corresponding sticky end length is one nt. A major challenge we faced was that DNA ligation based on such short sticky ends was inefficient with the conventional method, in which two oligos are annealed to form one barcode half with a one-nt overhang (FIG. 11a and FIG. 17). We assessed ligation efficiency by PCR: we attempted to amplify the barcoded fragments from the ligation product by using two oligos that bind only the barcode halves, so PCR would succeed only if enough sticky ends were ligated. In a test run barcoding five fragments (FIG. 11e), although correct PCR products were observed on agarose gel after electrophoresis, undesired PCR products (in the form of a smear) were also observed (FIG. 11c). When we further attempted to assemble the five barcoded fragments (each of which was excised from the gel to remove as many undesired products as possible) into a plasmid, we obtained only a few colonies, and sequencing the plasmids they contained revealed that a large fraction of the plasmids had duplications of some barcode halves (FIGS. 11f and 11g). We tried the same procedure with another two plasmids, each of which was also assembled from five fragments, and obtained similar, negative results (FIG. 11g and FIG. 18b).

The duplication of barcode halves suggested that two barcode halves were ligated to one side of some fragments in tandem. We hypothesized that sealing the blunt end of each barcode half, by connecting the two strands with a few extra nucleotides, could solve this problem, because the barcoded fragments would be circular DNA molecules, leaving no end for an additional barcode half to attach (FIG. 11b). Essentially, each barcode half would be created from one oligo that has a stem-loop secondary structure (such an oligo is termed a Boligo, barcoding oligo). With this new design, we repeated construction of the same plasmids. We found that most undesired PCR products were eliminated (FIG. 11d). More importantly, the number of colonies we obtained from the subsequent plasmid construction was ~100-fold higher than that from the conventional method (FIG. 11f), and the plasmids these colonies contained were confirmed to be correct by sequencing in all our tests (FIG. 11g and FIG. 18b).
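The stem-loop idea can be illustrated by building a single oligo whose two stem arms are reverse complements of each other, joined by a short loop and followed by a one-nt overhang. This is a sketch under stated assumptions: the arm order (reverse complement, loop, barcode half, overhang), the toy stem sequence and the function names are illustrative, and the 6-nt loop length follows the note accompanying Table 16.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_loop_oligo(barcode_half: str, loop: str, overhang: str) -> str:
    """Assemble 5'-[rc(half)]-[loop]-[half]-[overhang]-3' as one oligo."""
    return reverse_complement(barcode_half) + loop + barcode_half + overhang


def stems_pair(oligo: str, loop_len: int, overhang_len: int = 1) -> bool:
    """Check that the two arms flanking the loop can base-pair with each other."""
    body = oligo[:len(oligo) - overhang_len]
    stem_len = (len(body) - loop_len) // 2
    five_prime_arm = body[:stem_len]
    three_prime_arm = body[stem_len + loop_len:]
    return five_prime_arm == reverse_complement(three_prime_arm)


# Hypothetical 15-nt barcode half with a 6-nt loop and a 'G' overhang.
example = stem_loop_oligo("ttgtttaactttaag", "TATGTT", "G")
```

Because the blunt side of the duplex is closed by the loop, only the overhang end remains available for ligation in this model.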

In this workflow, a fragment was usually generated by amplifying a template DNA with two oligos that had a phosphorothioate (PS) bond after the first nucleotide at their 5′ end (FIG. 10c and FIG. 11b, design principle: FIG. 12f). The PS bonds were incorporated into the fragment by PCR and can be cleaved by a simple chemical reaction to generate the one-nt sticky ends (FIG. 10c). The oligo used in this step is termed a Foligo (fragment-creating oligo). Fragments shorter than 90 nt can also be generated by annealing two oligos, and this process also adds the sticky ends to the fragments because of the oligo design (FIG. 19). These oligos are termed Noligos (Non-modified oligos for creating fragments). Barcoded fragments can be activated by introducing PS bonds at specific locations of the barcodes through PCR (FIG. 10c; this is the PCR that we used to assess ligation efficiency). PS oligos are needed in this process, but they are standard parts: each oligo is associated with a barcode and can be used to amplify any fragment flanked by the barcode. Chemically cleaving the PS bonds in the PCR products generates long sticky ends (15 to 20 nt), which can be accurately annealed and ligated at elevated temperature by a thermophilic ligase (FIG. 10c and FIG. 20). Such a PS oligo is termed an Aoligo (Assembling oligo, FIG. 12c).
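A pair of Foligos for a given fragment can be sketched from the fragment sequence alone, using the ‘*’ notation of Table 14 for the phosphorothioate bond after the first 5′ nucleotide. The function name and the default primer length of 20 nt are assumptions made for illustration; primer design in practice would also consider melting temperature and specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_foligos(fragment: str, primer_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) Foligos with '*' marking the PS bond.

    The fragment must start with 'G' and end with 'T', so the forward oligo
    starts with 'G*' and the reverse oligo starts with 'A*'.
    """
    frag = fragment.upper()
    if not (frag.startswith("G") and frag.endswith("T")):
        raise ValueError("fragment must start with 'G' and end with 'T'")
    if len(frag) < primer_len:
        raise ValueError("fragment is shorter than the requested primer length")
    forward = frag[:primer_len]
    reverse = reverse_complement(frag[-primer_len:])
    return forward[0] + "*" + forward[1:], reverse[0] + "*" + reverse[1:]


# Hypothetical 42-nt fragment used only to exercise the function.
fwd, rev = design_foligos("G" + "ACGT" * 10 + "T")
```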

Example 6: Construction of Plasmids

We have successfully constructed 370 plasmids (P1 to P370) using this workflow (FIG. 21) for various projects from an expanding library of fragments and barcodes (Tables 14 and 15). The accuracy was 86% based on Sanger sequencing, and it did not change substantially as plasmid length (2.43-13.04 kb) and the number of fragments used in plasmid construction (2-7) varied (FIG. 10d). This large set of validation data shows that the GTas workflow is reliable and robust. In comparison, the average accuracy of the BASIC method was only 50% when six fragments were assembled and one antibiotic was used⁷.

TABLE 14 List of fragments, Foligos and Noligos used in this study SNForward Forward Foligo sequence Reverse Reverse Foligo sequence Foli-Frag- Foligo (*: phosphorothioate Foligo (*: phosphorothioate) Tem- gosment name bond) name bond) plates Annotation  1 gRNA- G-gRNA-G*tacgagttaatcaatatcaca gRNA-T R A*tctagagaattcaaaaaaag pTarget guidenupG nupG F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 1) Addgene is indicaed as bold  2 gRNA- G-gRNA-G*tgcagaacttgagaaaaaaac gRNA-T R A*tctagagaattcaaaaaaag pTarget guideaslA as A F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 3) Addgene is indicaed as bold  3 gRNA- G-gRNA-G*tctaccatttgttaattatgt gRNA-T R A*tctagagaattcaaaaaaag pTarget guidemelB me B F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 4) Addgene is indicaed as bold  4 gRNA- G-gRNA-G*taatcacttgagcaaattgag gRNA-T R A*tctagagaattcaaaaaaag pTarget guidercsB rcsB F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 5) Addgene is indicaed as bold  5 gRNA- G-gRNA-G*tttaataccgagcgttcaaaa gRNA-T R A*tctagagaattcaaaaaaag pTarget guidetyrR tyrR F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 6) Addgene is indicaed as bold  6 gRNA- G-gRNA-G*ttttgagcaattcattgaaag gRNA-T R A*tctagagaattcaaaaaaag pTarget guidepheA pheA F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 7) Addgene is indicaed as bold  7 gRNA- G-gRNA-G*tgaagttgatttctttagtat gRNA-T R A*tctagagaattcaaaaaaag pTarget guideptsl ptsI F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 8) Addgene is indicaed as bold  8 gRNA- G-gRNA-G*tataatactttgtcgatttga gRNA-T R A*tctagagaattcaaaaaaag pTarget guidexylB xylB F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 237) Addgene as bold  9 gRNA- G-gRNA-G*ttttaacgatatcgatacctt gRNA-T R A*tctagagaattcaaaaaaag pTarget guidemanZ manZ ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 238) Addgene is indicaed as bold 10 gRNA- G-gRNA-G*taatcgttaataatttccaga gRNA-T R 
A*tctagagaattcaaaaaaag pTarget guideglk glk F ttttagagctagaaatag (SEQ ID NO: 2) from RNA_Spacer(SEQ ID NO: 239) Addgene is indcaed as bold 11 nupG- G-nupG-G*ttgatcctgccagcaata nupG- A*catcgtgatgcggatgag E. coli Upstream HF0.5HF0.5 F (SEQ ID NO: 1) HF0.5-T R (SEQ ID NO: 10) genomic homologous DNAsequence 12 nupG- G-nupG- G*accatcgccgggacagaacc nupG-Same to nupG-HF0.5-T R E. coli Upstream HF1.0 HF1.0 F (SEQ ID NO: 250)HF1.0-T R genomic homologous DNA sequence 13 nupG- G-nupG-G*tgcaacgtgaagcagaaggt nupG- Same to nupG-HF0.5-T R E. coli UpstreamHF1.5 HF1.5 F SEQ ID NO: 12) HF1.5-T R genomic homologous DNA sequence14 aslA- G-aslA- G*caccgtaaacggctctgc aslA- A*gtttcatgtcatcaaaatgE. coli Upstream HF0.5 HF0.5 F (SEQ ID NO: 13) HF0.5-T R (SEQ ID NO: 14)genomic homologous DNA sequence 15 aslA- G-aslA- G*ccagtacgacgatcgcctaslA- Same to aslA-HF0.5-T R E. coli Upstream HF1.0 HF1.0 F(SEQ ID NO: 15) HF1.0-T R genomic homologous DNA sequence 16 melB-G-melB- G*cccaatggcgatgaatacct melB- A*gctgttaccaacgcccgcct E. coliUpstream HF1.0 HF1.0 (SEQ ID NO: 16) HF1.0-T R (SEQ ID NO: 17) genomichomologous DNA sequence 17 rcsB- G-rcsB- G*gttagcgaacatgcttgcgg rcsB-A*ttgctacagcaagctcttga E. coli Upstream HF1.0 HF1.0 F (SEQ ID NO: 18)HF1.0-T R (SEQ ID NO: 19) genomic homologous DNA sequence 18 tyrR-G-tyrR- G*cagcccgctggcgttggt tyrR- A*gtcagcacccgatattgcat E. coliUpstream HF0.5 HF0.5 F (SEQ ID NO: 20) HF0.5-T R (SEQ ID NO: 21) genomichomologous DNA sequence 19 pheA- G-pheA- G*catgtcgcagaccgtctcg pheA-A*cgaaacgcctcccattcag E. coli Upstream HF0.5 HF0.5 F (SEQ ID NO: 22)HF0.5-T R (SEQ ID NO: 23) genomic homologous DNA sequence 20 ptsI-G-ptsI- G*gcccgcataaaattcaggg ptsI-HF0.7- A*ggaactaaagtctagcctgg E. coliUpstream HF0.7 HF0.7 F (SEQ ID NO: 24) T R (SEQ ID NO: 25) genomichomologous DNA sequence 21 manZ- G-manZ- G*cgggccaggtactgaccatc manZ-A*tagtccagttcgttatcgag E. 
coli Upstream HF0.5 HF0.5 F (SEQ ID NO: 240)HF0.5-T R (SEQ ID NO: 241) genomic homologous DNA sequence 22 glk-G-glk- G*cgcagagggcggaaccggtg glk-HF0.5- A*cgctaaagtcaaaataattc E. coliUpstream HF0.5 HF0.5 F (SEQ ID NO: 242) T R (SEQ ID NO: 243) genomichomologous DNA sequence 23 xylB- G-xylB- G*cgtggcgatgcgcaactg xylB-A*agatctatcccgatatacat E. coli Upstream HF0.5 HF0.5 F (SEQ ID NO: 244)HF0.5-T R (SEQ ID NO: 245) genomic homologous DNA sequence 24 nupG-G-nupG- G*ttacgcaaagaaaaacgg nupG- A*gccgctggttgaggtgtt E. coliDownstream HT0.5 HT0.5 F (SEQ ID NO: 26) HT0.5-T R (SEQ ID NO: 27)genomic homologous DNA sequence 25 nupG- G-nupG- Same to G-nupG-HT0.5 FnupG- A*acgcctttatgctccatgct E. coli Downstream HT1.0 HT1.0 F HT1.0-T R(SEQ ID NO: 28) genomic homologous DNA sequence 26 nupG- G-nupG-Same to G-nupG-HT0.5 F nupG- A*gcgacgccggtctatctgga E. coli DownstreamHT1.5 HT1.5 F HT1.5-T R (SEQ ID NO: 29) genomic homologous DNA sequence27 aslA- G-aslA- G*gccggcgctatcgctgag aslA- A*cactatgtttatccgcaa E. coliDownstream HT0.5 HT0.5 F (SEQ ID NO: 30) HT0.5-T R (SEQ ID NO: 31)genomic homologous DNA sequence 28 aslA- G-aslA- Same to G-aslA-HT0.5 FaslA- A*gcccgcctgagatccaca E. coli Downstream HT1.0 HT1.0 F HT1 .0-T R(SEQ ID NO: 32) genomic homologous DNA sequence 29 melB- G-melB-G*gtgcagtgagtgatgtgaaa melB- A*gggtatggaagctatctgga E. coli DownstreamHT1.0 HT1.0 F (SEQ ID NO: 33) HT1.0-T R (SEQ ID NO: 34) genomichomologous DNA sequence 30 rcsB- G-rcsB- G*tcacctgtaggccagataag rcsB-A*attcagaaccgggaatgggc E. coli Downstream HT1.0 HT1.0 F (SEQ ID NO: 35HT1.0-T R (SEQ ID NO: 36) genomic homologous DNA sequence 31 tyrR-G-tyrR- G*gcgcgaatatgcctgatg tyrR- A*catcccgcaggcgggtag E. coliDownstream HT0.5 HT0.5 F (SEQ ID NO: 37) HT0.5-T R (SEQ ID NO: 38)genomic homologous DNA sequence 32 pheA- G-pheA- G*ttactggcgattgtcattcgpheA- A*aaatgggccattacaggcc E. 
coli Downstream HT0.5 HT0.5 F(SEQ ID NO: 39) HT0.5-T R (SEQ ID NO: 40) genomic homologous DNAsequence 33 ptsI- G-ptsI- G*agcgcatcacttccagtac ptsI-HT0.7-A*taacgataagagtagggcac E. coli Downstream HT0.7 HT0.7 F (SEQ ID NO: 41)T R (SEQ ID NO: 42) genomic homologous DNA sequence 34 manZ- G-manZ-G*gactgttgtacactaccggg manZ- A*acgagaagcttataaatttt E. coli DownstreamHT0.5 HT0.5 F (SEQ ID NO: 246) HT0.5-T R (SEQ ID NO: 247) genomichomologous DNA sequence 35 glk- G-glk- G*atccttccttttatatcgggglk-HT0.7-T A*gcccgcagcgtttttaattg E. coli Downstream HT0.5 HT0.5 F(SEQ ID NO: 248) R (SEQ ID NO: 249) DNA homologus sequence 36 xylB-G-xylB- G*acgttatcccctgcctga xylB- A*cgaaacaaacgcatttga E. coliDownstream HT0.5 HT0.5 F (SEQ ID NO: 250) HT0.5-T R (SEQ ID NO: 251)genomic homologous DNA sequence 37 SpectR G-spectR G*tcgacctgcagaagcttspectR-T A*cgttaagggattttggt pTarget Antibiotic F (SEQ ID NO: 43) R(SEQ ID NO: 44) from resistence Addgene gene 38 AmpR G-ampR FG*tttctacaaactctttt G-ampR F Same to spectR-T R p5T7- Antibiotic(SEQ ID NO: 45) eGFP resistance from gene this lab 39 pSC101 G-repA/p5G*ccgttttcatctgtgcatat repA/p5-T A*tccttttgtaatactgcgga p5T7-Replication F (SEQ ID NO: 46) R (SEQ ID NO: 47) eGFP origin-low fromcopy number this lab 40 pAC G-p15A F G*tgttcagctactgacgg p15A-T RA*gacatcaccgatgggga pACmini- Replication (SEQ ID NO: 48) (SEQ ID NO: 49)B2B3 origin- from medium copy this lab number 41 pMB1 G-pMB1 FG*agttttcgttccactga pMB1-T R A*ggatccagcatatgcgg pTarget Replication(SEQ ID NO: 50) (SEQ ID NO: 51) from origin- Addgene medium copy number42 pUC G-pUC F G*gctcactcaaaggcggta pUC-T R A*attaccgcctttgagtga pUC19Replication (SEQ ID NO: 52) (SEQ ID NO: 53) from NEB origin-highcopy number 43 pLac G-pLac F G*caacgcaattaatgtgagt pLac-T RA*tttgttatccgctcacaatt pUC19 Promoter (SEQ ID NO: 54) (SEQ ID NO: 55)from NEB 44 pthrC3 G-pthrC3 G*agcttttcattctgactgcaa pthrC3-T RA*ggttgttacctcgttacctt E.coli Promoter F (SEQ ID NO: 252)(SEQ ID NO: 253) genomic DNA 45 LacIPT7 G-lacIT7 
G*gaaactacccataatacaagLacIT7-T R A*gaggggaattgttatccgct pET11a- Promoter F (SEQ ID NO: 56)cacaattcccctatagtga eGFP with LacI (SEQ ID NO: 57) from repressorthis lab coding sequence 46 t7t G-t7t F G*ggctgctaacaaagccc t7t-T RA*ggcaccgtcaccctggat pET11a- Terminator (SEQ ID NO: 58) (SEQ ID NO: 59)eGFP from this lab 47 LacI G-lacI F Same to G-lacIT7 F lacI-T RA*tcccggacaccatcgaat pET11a- Coding (SEQ ID NO: 60) eGFP sequence fromthis lab 48 tyrR G-tyrR G*gttgctgaattgaccgcatt tyrR A*ctggcgattgtcattcgcpACmini- Coding mutant mutant F (SEQ ID NO: 69) mutant-T R(SEQ ID NO: 70) tyrR sequence mutant/ met5 3-Ile_ Ala354- Val fromthis lab 49 aroG G-aroG G*aattatcagaacgacgattt aroG A*cccgcgacgcgcttttapACmini- Coding mutant mutant F (SEQ ID NO: 71) mutant-T R(SEQ ID NO: 72) aroG sequence mutant/ aroG Asp146- Asn from this lab 50tal G-tal F G*acccaggttgttgaacgtca tal-T R A*gccaaaatctttaccatcSynthetic Coding (SEQ ID NO: 254) tgcttc DNA from sequence(SEQ ID NO: 255) IDT 51 eGFP G-eGFP F G*agcaagggcgaggagctgtt eGFP-T R3A*cttgtacagctcgtccatgc pET11a- Coding (SEQ ID NO: 256) (SEQ ID NO: 257)eGFP from sequence this lab 52 ppsA G-ppsA F G*tccaacaatggctcgtcappsA-T R A*ttatttcttcagttcagcca E. coli Coding (SEQ ID NO: 258) gggenomic sequence (SEQ ID NO: 259) DNA 53 tktA G-tktA FG*tcctcacgtaaagagcttg tktA-T R A*cagcagttcttttgctttc E. coli Coding(SEQ ID NO: 260) (SEQ ID NO: 261) DNA 54 aroE G-aroE FG*gaaacctatgctgtttttg aroE-T R A*cgcggacaattcctcctgc E. coli Coding gta(SEQ ID NO: 263) genomic sequence (SEQ ID NO: 262) DNA SN_ Noli- Frag-Forward Reverse gos ment Noligo Sequence Noligo Sequence SourceAnnotation  1 CelN20- CelN20- Gagggtaatacacgcgaagac CelN20-gcgcttaacgttgtcgttaccc Synthetic E. 
coli Noligo F aactttaagcacttattgggtNoligo R aataagtgcttaaagttgtctt oligo secretion aacgacaacgttaagcgctcgcgtgtattaccctcc from IDT signal (SEQ ID NO: 264) (SEQ ID NO: 265)peptide  2 AS AS- agtcgacctgcaggtatgtta AS-Noligo gtagtccatattaacatacctgSynthetic Antisense Noligo F atatggactact R caggtcgactc oligo RNA in(SEQ ID NO: 266) (SEQ ID NO: 267) from IDT yeast

TABLE 15 List of barcode in our GTas-based library SN Barcode LeftSticky end Right Complete sequence Annotation  1 N21 tc cctcgactcacacttgg tccctcgactcacacttgg Non-functional (SEQ ID NO: 268) (SEQ ID NO: 269) 2 N22 tc Acacaacatagccac gg tcacacaacatagccacgg Non-functional(SEQ ID NO: 270) (SEQ ID NO: 271)  3 N23 t caaacagaaaggccat ggtcaaacagaaaggccatgg Non-functional (SEQ ID NO: 272) (SEQ ID NO: 273)  4N24 tc ctcatggattctacg gg tcctcatggattctacggg Non-functional(SEQ ID NO: 274) (SEQ ID NO: 275)  5 3UTR1 ga tgggctgaagggttt aagatgggctgaagggtttaa 3' end untranslated (SEQ ID NO: 276)(SEQ ID NO: 277) region with stop codon  6 N32 gt cccatcacagcttac aagtcccatcacagcttacaa Non-functional (SEQ ID NO: 278) (SEQ ID NO: 279)  7N36 ac taagggcgaatgtac ag actaagggcgaatgtacag Non-functional(SEQ ID NO: 280) (SEQ ID NO: 281)  8 N37 aa ctttgggtcatgtgc tgaactttgggtcatgtgctg 3' end untranslated (SEQ ID NO: 282)(SEQ ID NO: 283) region with stop codon  9 N38 tc actaaaccacagcct actcactaaaccacagcctac Non-functional (SEQ ID NO: 284) (SEQ ID NO: 285) 10N39 ac gtgggacatctgttt cg acgtgggacatctgtttcg Non-functional(SEQ ID NO: 286) (SEQ ID NO: 287) 11 N40 gtgc gcggaacccctatttgtgcgcggaacccctattt Non-functional (SEQ ID NO: 288) (SEQ ID NO: 289) 12pJ23119 tgacagc tagctcagtcctagg tataatacta tgacagctagctcagtcctaggPromtor_E. coli (SEQ ID NO: 290) (SEQ ID tataatacta NO: 291)(SEQ ID NO: 292) 13 SRBS2 aaccg ttcatttatcacaaa ttgttcgatAaccgttcatttatcacaaaag Ribosome binding agga gattgttcgat site with stop(SEQ ID NO: 293) (SEQ ID NO: 294) codon_E. coli 14 SRBS3 gattcacacaggaaacagct at gattcacacaggaaacagctat Ribosome binding(SEQ ID NO: 295) (SEQ ID NO: 296) site with stop codon_E. coli 15 SRBS4aaattaat caggtgaaggttccc at aaattaattgttcttttttcag Ribosome bindingtgttcttt (SEQ ID NO: 298) gtgaaggttcccat site with stop ttt(SEQ ID NO: 299) codon_E. 
coli (SEQ ID NO: 297) 16 SRBS5 gactccatttttgggctaacagg aggaggaatt Gactccatacccgttttttggg Ribosome bindingacccgtt (SEQ ID NO: 301) aaccat ctaacaggaggaggaattaaccat site with stop(SEQ ID (SEQ ID (SEQ ID NO: 303) codon_E. coli NO: 300) NO: 302) 17 T2Acaggaagc taacatgcggtgacg tcgaggagaa Caggaagcggagagggcagaggaa2A peptide_Yeast ggagaggg (SEQ ID NO: 305) tcctggtcctgtctgctaacatgcggtgacgtcg cagaggaa gg aggagaatcctggtcctgg gtctgc (SEQ ID(SEQ ID NO: 307) (SEQ ID NO: 306) NO: 304) 18 F2A caggaagctcgaccttctcaagt Tccaaggcgg Caggaagcggagtgaaacagactt 2A peptide_Yeastggagtgaa (SEQ ID NO: 309) gagacgccct tgaatttcgaccttctcaagttgg acagacttggttggagtc cgggagacgtggagtccaaccctg tgaatt c gtcc (SEQ ID (SEQ ID(SEQ ID NO: 311) NO: 308) NO: 310) 19 LK3 ca ggctcgtcttatca ggcaggctcgtcttcttcagg Linker used to (SEQ ID NO: 312) (SEQ ID NO: 313)create fusion protein 20 LK5 cc tcatcgtcaggcacc tc cctcatcgtcaggcacctcLinker used to (SEQ ID NO: 314) (SEQ ID NO: 315) create fusion protein21 LK6 ataag gacaacctgtatttcc aggg ataaggacaacctgtatttccagggLinker used to (SEQ ID NO: 316) (SEQ ID NO: 317) create fusion protein22 LK7 cagc agcaggagcacacat cagcagcaggagcacacat Linker used to(SEQ ID NO: 318) (SEQ ID NO: 319) create fusion protein 23 Shistag ctcatcatcatcaccatc actga ctcatcatcatcaccatcactga histag plus stop(SEQ ID NO: 320) (SEQ ID NO: 321) codon 24 5UTR a gcaatctaatctaattctagaaaa agcaatctaatctaagttttcta 5′ end untranslated gtt at gaaaaatregion_Yeast (SEQ ID NO: 322) (SEQ ID (SEQ ID NO: 324) NO: 323) 25 5UTR2Tcgacggattctagta aaatcat tcgacggattctagtaaaatcat 5′ end untranslated(SEQ ID NO: 325) (SEQ ID NO: 326) region_Yeast 26 5UTR3 tgaacaactatcaaaacacaa tgat tgaacaactatcaaaacacaatgat 5′ end untranslated(SEQ ID NO: 327) (SEQ ID NO: 328) region_Yeast 27 5UTR4 agcaatctaatctaagtt ttaattacaa agcaatctaatctaagttttaatta5′ end untranslated (SEQ ID NO: 329) atctagaat caaatctagaat region_Yeast(SEQ ID (SEQ ID NO: 331) NO: 330) 28 5UTR5 a gcaatctaatctaagttttaattacaa agcaatctaatctaagttttaatta 5′ end untranslated(SEQ ID NO: 
332) aat tcaaaa region_Yeast (SEQ ID (SEQ ID NO: 334)NO: 333) 29 5UTR6 a gcaatctaatctaagtt taccccatagcaatctaatctaagtttaccccat 5′ end untranslated (SEQ ID NO: 335)(SEQ ID NO: 336) region_Yeast 30 5UTR7 a gcaatctaatctaagtt taaaatagcaatctaatctaagtttaaaat 5′ end untranslated (SEQ ID NO: 337)(SEQ ID NO: 338) region_Yeast 31 3UTR2 aa catcatcatcaccatca ctaaaaaacatcatcatcaccatcactaaaa 3′ end untranslated (SEQ ID NO: 339)(SEQ ID NO: 340) region_Yeast 32 3UTR3 ctcaaaaat tgatttctgaagaa tgtaatgactcaaaaattgatttctgaagaaga 3′ end untranslated gatt tttgtaatgaregion_Yeast (SEQ ID NO: 341) (SEQ ID NO: 342) 33 3UTR4 ctctcgagcatcatcaccatcac catcatcatt ctctcgagcatcatcaccatcaccat3′ end untranslated (SEQ ID NO: 343) aatga catcattaatga region_Yeast(SEQ ID (SEQ ID NO: 345) NO: 344) 34 3UTR5 gacgg gcaatcgcatcacatt caagagacgggcaatcgcatcacattcaaga 3′ end untranslated (SEQ ID NO: 346)(SEQ ID NO: 347) region_Yeast 35 3UTR6 attag ttatgtcacgcttaca ttcacattagttatgtcacgcttacattcac 3′ end untranslated (SEQ ID NO: 348)(SEQ ID NO: 349) region_Yeast 36 3UTR7 ct catcatcatcatcacc atgctcatcatcatcatcaccatg 3' end untranslated (SEQ ID NO: 350)(SEQ ID NO: 351) region_Yeast 37 ICeuI aact ataacggtcctaa gcgaaaactataacggtcctaaggtagcgaa Homing ggta (SEQ ID NO: 353) endonuclease(SEQ ID NO: 352) cutting site 38 SEC17_L atgtagtagggaaa tcaaaatgtagtagggaaatatatcaaa Yeast intron tata (SEQ ID NO: 355) sequence(SEQ ID NO: 354) 39 SEC17_R t gatatttcccgttgtg ttaatgatatttcccgttgtgttaa Yeast intron (SEQ ID NO: 356) (SEQ ID NO: 357)sequence 40 EFB1_L atgttccgatttag ctttata atgttccgatttagtttactttataYeast intron ttta (SEQ ID NO: 359) sequence (SEQ ID NO: 358) 41 EFB1_Rgttttgtttctcct aaata gttttgtttctccttttaaaata Yeast intron ttta(SEQ ID NO: 361) sequence (SEQ ID NO: 360)

Example 7: Feature of GTas: Flipping Standard Parts in Plasmid Constructions

A unique feature of GTas is that any fragment and barcode can be flipped in a plasmid by using standard parts (Boligos and Aoligos). There are eight possible ways to link two fragments through one barcode, as each fragment and the barcode can be flipped independently (FIG. 12d). Each barcode in GTas has the following structure: L-SE-R (FIG. 12a). SE is the region that encodes the long (15 to 20 nt) Sticky End; L and R are the regions flanking the Left and Right sides of SE, and both are allowed to be empty. The two halves of each barcode have the following structures: L-SE and SE-R. Each Boligo encodes one barcode half. Because each barcode half may be connected to either end of a fragment, there are two Boligos for each barcode half, carrying a ‘G’ or an ‘A’ overhang (FIG. 12b). Each barcode thus has four Boligos in total. The combinations of Boligos that implement the eight connection types are listed in FIG. 12d.
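The counts in this paragraph (eight connection types, four Boligos per barcode) follow from independent binary choices and can be enumerated directly. The sketch below is illustrative; the orientation labels and the Boligo naming scheme are assumptions and do not reproduce the mapping shown in FIG. 12d.

```python
from itertools import product

ORIENTATIONS = ("forward", "flipped")
BARCODE_HALVES = ("L-SE", "SE-R")
OVERHANGS = ("G", "A")


def connection_types() -> list[tuple[str, str, str]]:
    """All (fragment 1, barcode, fragment 2) orientation combinations."""
    return list(product(ORIENTATIONS, repeat=3))


def boligos_for_barcode(name: str) -> list[str]:
    """Hypothetical labels for the Boligos of one barcode: two halves x two overhangs."""
    return [f"{name}:{half}:{overhang}"
            for half, overhang in product(BARCODE_HALVES, OVERHANGS)]
```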

We experimentally demonstrated the eight connection types by flipping the replication origin (RO), the antibiotic resistance marker (AR) and the barcode connecting them in a plasmid that can express green fluorescent protein (GFP, FIG. 12e) in Escherichia coli (E. coli). The plasmids were sequenced to ensure that the desired arrangements were achieved. Interestingly, the expression level of GFP varied substantially (up to nearly four-fold) among the eight constructs, showing the importance of being able to flip elements of a plasmid. The difference could be caused by the presence of known and hidden promoters in AR and RO, collision of DNA polymerase (used for plasmid replication) and RNA polymerase (used for transcription) on the plasmid³², or the local structure of RO (which could be affected by flipping the barcode). These hypotheses are good topics for future studies on better controlling protein expression. Flipping parts also allows standard parts to be arranged in ways that are more complex and more similar to how DNA parts are naturally arranged in a genome³³, e.g. arranging two operons on the sense and antisense strands to avoid transcriptional crosstalk, and placing two repeated sequences on the two strands to avoid undesired homologous recombination.

Example 8: Feature of GTas: Flexible Editing of Constructed Plasmids

One initial demonstration involved construction of 16 plasmids for engineering E. coli to overproduce tyrosine (FIGS. 13a and 13b), a valuable aromatic amino acid³⁴. Each plasmid was assembled from six fragments: promoter, gene 1, gene 2, terminator, AR and RO. The sixteen plasmids are the full combinations of four ROs, two promoters, and two operon structures (4×2×2, elaborated in FIGS. 13a and 4c). Six barcodes were used. Three of them were functional (two containing ribosomal binding sites (RBSs) and one providing a stop codon), and the rest merely served as connectors. The cells harboring the 16 plasmids exhibited a wide range of abilities in producing tyrosine (FIG. 13c). These plasmids provide a basis for further plasmid construction.

GTas allows flexible reuse of constructed plasmids. By using standard parts, we can replace fragments in, remove fragments from, and add fragments to any plasmid constructed under GTas (FIGS. 13d, 13e and 13f). Below are three demonstrations. Our lab had developed a short auto-inducible promoter³⁵ (PthrC3), and we were interested in replacing the promoter (PT7) of the best performing plasmid (among the 16 plasmids; plasmid name: TPP2) with PthrC3. We amplified the whole plasmid except PT7 by using two Aoligos of the barcodes that flanked PT7, barcoded PthrC3 with the same barcodes, and assembled them into one plasmid (FIG. 13d). The new plasmid (TPP17) led to a slightly higher tyrosine titer than the one with PT7 (FIGS. 13c and 13h), and simplified the fermentation process by eliminating the addition of inducer (PT7 requires addition of isopropyl β-D-1-thiogalactopyranoside as an inducer). To test whether there is any hidden promoter upstream of PT7, we needed to remove PT7 from plasmid TPP2 (FIG. 13e). We amplified the plasmid except PT7 and an adjacent fragment (RO) by using two Aoligos of the barcodes that flanked them, barcoded the adjacent fragment (RO) with the same barcodes, and assembled them (FIG. 13e). Surprisingly, the constructed plasmid still led to production of 1 g/L of tyrosine (FIG. 13h), indicating the presence of hidden promoter(s) in the upstream sequence. We used the parental strain without any plasmid as the negative control, which was confirmed not to produce tyrosine (detection limit: 61 mg/L). Tyrosine can be deaminated into coumaric acid, a precursor of many valuable flavonoids^(A12), when the gene tal (encoding tyrosine ammonia-lyase) is expressed³⁷ (FIG. 13b). When we added tal to TPP17 by using standard oligos as described in FIG. 13f, E. coli with this new plasmid readily produced 22 mg/L of coumaric acid (FIG. 13i).

As shown in the above mini project, plasmids are often improved through multiple rounds of modification, because plasmid performance needs to be assessed experimentally and used as feedback to direct the next round of plasmid construction. New parts and ideas from peers also drive many researchers to improve their plasmids in such an iterative manner, so being able to edit constructed plasmids by using standard parts would help many researchers move their projects forward faster and at lower cost.

Example 9: Feature of GTas: Easily Constructing a Plasmid Library for Combinatorial Optimization

If a barcoded fragment in plasmid construction is replaced by a mixture of fragments that are barcoded in the same way, a mixture of plasmids is obtained. If two barcoded fragments are replaced in this way, a more diverse mixture of plasmids is obtained. Such a plasmid mixture is termed a plasmid library, and can be used for combinatorial optimization of strains. To demonstrate this concept (FIG. 14), a small plasmid library was built to improve the ability of E. coli to produce coumaric acid (FIG. 14a). We first constructed six plasmids that covered all possible ways of shuffling aroG, tyrA and tal (the three genes we used for coumaric acid production) in an operon (FIGS. 14a and 14b). The operon together with its promoter and terminator is termed Module 1 (M1). We further selected another three genes that were reported to improve production of aromatic compounds (tktA, aroE and ppsA, FIG. 14b), and constructed six plasmids to shuffle them in an operon, which together with its promoter and terminator is termed Module 2 (M2). Variants of M1 and M2 can be easily amplified from the 12 plasmids and combined into new plasmids (FIG. 14a). Two libraries were created. Each contained the full combinations of the six variants of M1 and the six variants of M2 (36 possibilities). The difference between the two libraries was the plasmid backbone type (FIG. 14c). To simplify the workflow, we picked 72 colonies (twice the size of the library) after transformation of E. coli with each plasmid library, and screened them by coumaric acid titer. In general, the library with the higher copy number replication origin (Library 2) produced more coumaric acid, and the top performer produced 263 mg/L of coumaric acid, which was more than 10 times higher than the titer before this combinatorial optimization (FIG. 13i). We can easily determine the identity of the plasmids responsible for the higher coumaric acid production by sequencing (FIG. 14d).
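The library size and the choice of picking 72 colonies can be put in numbers. The sketch below enumerates the six gene orders of each module and the 36 module combinations, then estimates the fraction of the library expected to appear among the picked colonies. The coverage formula assumes every variant is equally abundant and colonies are picked independently, which is an idealization not stated in this example; the function names are hypothetical.

```python
from itertools import permutations, product

M1_GENES = ("aroG", "tyrA", "tal")
M2_GENES = ("tktA", "aroE", "ppsA")


def operon_orders(genes: tuple[str, ...]) -> list[tuple[str, ...]]:
    """All ways of ordering the genes within one operon."""
    return list(permutations(genes))


def module_combinations() -> list[tuple[tuple[str, ...], tuple[str, ...]]]:
    """Every pairing of an M1 variant with an M2 variant."""
    return list(product(operon_orders(M1_GENES), operon_orders(M2_GENES)))


def expected_coverage(library_size: int, colonies: int) -> float:
    """Expected fraction of variants seen at least once (equal abundance assumed)."""
    return 1.0 - (1.0 - 1.0 / library_size) ** colonies


library = module_combinations()
coverage = expected_coverage(len(library), 72)
```

Under these assumptions, picking twice the library size is expected to sample roughly 87% of the 36 variants at least once.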

In many biotechnology applications, due to lack of accurate in silico models and/or in-depth understanding of mechanisms of the biological systems, combinatorial optimizations have been widely used and proven to be effective^(38, 39). In this study, we did not intend to achieve the highest tyrosine and coumaric acid titers (reports with higher values are available^(40, 41)); instead we used these optimization exercises to demonstrate the unique features of GTas, and to provide a relevant context. As demonstrated in this example, for the first time GTas allowed construction of plasmid library from.

Example 10: Feature of GTas—being Able to Handle Short Fragments

GTas has also been used to construct plasmids from standard parts for various applications that need to use short genetic elements (including E. coli gene editing through the CRISPR/Cas9 system), which are described in FIG. 19 and FIG. 21.

Example 10: Other Materials and Methods

Chemicals

All chemicals were purchased from Sigma-Aldrich unless otherwise stated. All DNA oligos used in this work were synthesized by Integrated DNA Technologies and Guangzhou IGE Biotechnology LTD, and the DNA oligo sequence information is provided in Tables 14, 16, 17, 19 and 21.

TABLE 16 List of oligos used to prepare Boligos in this study RG-f RG-r(Conventional (Conventional Sequence (/5Phos/: oligo) Sequence oligo)5′-end phosphorylation) N21 cctcgactcacacttg N21/5Phos/ccaagtgtgagtcgagg (SEQ ID NO: 362) (SEQ ID NO: 363) N22acacaacatagccacggG N22 /5Phos/ccgtggctatgttgtgt (SEQ ID NO: 364)(SEQ ID NO: 365) N23 caaacagaaaggccatggG N23 /5Phos/ccatggcctttctgtttg(SEQ ID NO: 366) (SEQ ID NO: 367) N24 ctcatggattctacgggG N24/5Phos/cccgtagaatccatgag (SEQ ID NO: 368) (SEQ ID NO: 369) pJ23119tagctcagtcctaggtataata pJ23119 /5Phos/tagtattatacctagg ctaG actgagcta(SEQ ID NO: 370) (SEQ ID NO: 371) LG-f Sequence (/5Phos/: LG-r(Conventional 5′-end (Conventional oligo) phosphorylation oligo)Sequence N21 /5Phos/tccctcgactcacactt N21 aagtgtgagtcgagggaA(SEQ ID NO: 372) (SEQ ID NO: 373) N22 /5Phos/tcacacaacatag N22gtggctatgttgtgtgaA ccac (SEQ ID NO: 375) (SEQ ID NO: 374) N23/5Phos/tcaaacagaaagg N23 atggcctttctgtttgaA ccat (SEQ ID NO: 377)(SEQ ID NO: 376) N24 /5Phos/tcctcatggattctacg N24 cgtagaatccatgaggaA(SEQ ID NO: 378) (SEQ ID NO: 379) pJ23119 /5Phos/tgacagctagctca pJ23119cctaggactgagctagctgtcaA gtcctagg (SEQ ID NO: 381) (SEQ ID NO: 380)Sequence (blue and Sequence (blue and red highlighted red highlightedsequences are stem sequences are stem regions that are regions that arecomplementary to complementary to each other, and thereeach other, and there is a 6 bp loop (upper is a 6 bp loop (upperRG-Boligo case) region between LA-Boligo case) region between(Novel oligo) them) (Novel oligo) them) RBS1 AtatgtatatctccttcttaaagtRBS1 agaaataattttgtttaactttaag taaacaaTATGTTttgttta aaggTCTACTccttcttaaaactttaagaaggagatataca gttaaacaaaattatttctA tatG (SEQ ID NO: 383)(SEQ ID NO: 382) SRBS2 atcgaacaatccttttgtgata SRBS2aaccgttcatttatcacaaaag aatgaaTCGGTTttcattta gaTATCACtccttttgtgtga tcaatgaacggttA acaaaaggattgttcgatG (SEQ ID NO: 385) (SEQ ID NO: 384) SRBS3atagctgtttcctgtgtGCCT SRBS3 gattcacacaggaaacagctT GGacacaggaaacagctCTTCGagctgtttcctgtgtga atG atcA (SEQ ID NO: 386) (SEQ ID 
NO: 387) SRBS4atgggaaccttcacctgAAG SRBS4 aaattaattgttcttttttcaggtga TTAcaggtgaaggttcccaggttcccTCACATgggaa atG ccttcacctgaaaaaagaaca (SEQ ID NO: 388) attaatttA(SEQ ID NO: 389) pJ23119 tagtattatacctaggactgag pJ23119tgacagctagctcagtcctagg ctaAGAGGGtagctca GCACAGcctaggactgagcgtcctaggtataatactaG tagctgtcaA (SEQ ID NO: 390) (SEQ ID NO: 391) N21ccaagtgtgagtcgaggAA N21 tccctcgactcacacttGCGA GGGCcctcgactcacacGAaagtgtgagtcgagggaA ttggG (SEQ ID NO: 393) (SEQ ID NO: 392) N22ccgtggctatgttgtgtCGTA N22 tcacacaacatagccacTTG TTacacaacatagccacCTGgtggctatgttgtgtgaA ggG (SEQ ID NO: 395) (SEQ ID NO: 394) N23ccatggcctttctgtttgACCA N23 tcaaacagaaaggccatCAT TAcaaacagaaaggccaTCAatggcctttctgtttgaA tggG (SEQ ID NO: 397) (SEQ ID NO: 396) N24cccgtagaatccatgagAA N24 tcctcatggattctacgACTAT CCCGctcatggattctaGcgtagaatccatgaggaA cgggG (SEQ ID NO: 399) (SEQ ID NO: 398) 3UTR1ttaaacccttcagcccaCAC 3UTR1 gatgggctgaagggtttCACA ACAtgggctgaagggCAaaacccttcagcccatcA tttaaG (SEQ ID NO: 401) (SEQ ID NO: 400) N32ttgtaagctgtgatgggGCC N32 gtcccatcacagcttacGCTT TGGcccatcacagcttacaCGgtaagctgtgatgggacA aG (SEQ ID NO: 403) (SEQ ID NO: 402) LK3cctgaagaagacgagccCA LK3 caggctcgtcttcttcaCACA CACAggctcgtcttcttcaggCAtgaagaagacgagcctgA G (SEQ ID NO: 405) (SEQ ID NO: 404) 5UTRatttttctagaaaacttagatta 5UTR agcaatctaatctaagttGCA gattgcAAGTTAgcaatctCATaacttagattagattgctA aatctaagttttctagaaaaat (SEQ ID NO: 407) G(SEQ ID NO: 406) 3UTR2 ttttagtgatggtgatgatgatg 3UTR2aacatcatcatcaccatcaAT GGCAATcatcatcatcacc GTACtgatggtgatgatgatgatcactaaaaG ttA (SEQ ID NO: 408) (SEQ ID NO: 409) Sequence (blue andSequence (blue and red highlighted red highlighted sequences are stemsequences are stem regions that are regions that are complementary tocomplementary to each other, and there each other, and thereis a 6 bp loop (upper a 6 bp loop (upper RA-Boligo case) region betweenLG-Boligo case) region between (Novel oligo) them) (Novel oligo) them)N36 ctgtacattcgcccttaAGCT N36 actaagggcgaatgtacAGC TGtaagggcgaatgtacagTTGgtacattcgcccttagtG A (SEQ ID NO: 
411) (SEQ ID NO: 410) ICeuIttcgctaccttaggaccgttat ICeuI aactataacggtcctaaggtaA AGCTTGataacggtcctaGCTTGtaccttaggaccgtta aggtagcgaaA tagttG (SEQ ID NO: 412)(SEQ ID NO: 413) N22 ccgtggctatgttgtgtAGCT N22 tcacacaacatagccacAGCTGacacaacatagccacgg TTGgtggctatgttgtgtgaG (SEQ ID NO: 414)(SEQ ID NO: 415) 3UTR2 ttttagtgatggtgatgatgatg 3UTR2aacatcarcatcaccatcaAT GGCAATcatcatcatcacc GTACtgatggtgatgatgatgatcactaaaaA ttG (SEQ ID NO: 416) (SEQ ID NO: 417) N21ccaagtgtgagtcgaaggAA N21 tccctcgactcacacttGCGA GGGCcctcgactcaacacttGAaagtgtgagtcgagggaG ggA (SEQ ID NO: 419) (SEQ ID NO: 418) N23ccatggcctttctgtttgACCA N23 tcaaacagaaaggccatCAT TAcaaacagaaaggccatgTCAatggcctttctgtttgaG gA (SEQ ID NO: 421) (SEQ ID NO: 420)

PCR

PCR reaction solution in this work contained 1-5 μL of template DNA, 0.3 μL of 100 μM forward oligo, 0.3 μL of 100 μM reverse oligo, 25 μL of Q5® Hot Start High-Fidelity 2× Master Mix (M0494, New England Biolabs [NEB]), and ultrapure water to top up to 50 μL. The cycling conditions were based on the manufacturer's instructions. All amplified DNA fragments were separated by standard gel electrophoresis and then purified by using a commercial column according to the manufacturer's instructions (GeneJET Gel Extraction Kit, K0691, Thermo Fisher Scientific). At the end of the purification, DNA was eluted from the column by using 40 μL of nuclease-free water (BUF-1180, 1st BASE Biochemicals [1st BASE]) in a 1.7 mL Eppendorf tube.
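The reaction composition above can be checked with simple volume arithmetic. The sketch below is illustrative only; the function name and the returned keys are hypothetical, and the volumes are the ones stated in the preceding paragraph.

```python
def pcr_mix(template_ul: float, total_ul: float = 50.0) -> dict:
    """Return the water volume and final oligo concentration for one PCR.

    Volumes follow the 50 uL reaction described in the text; the
    template volume must lie in the stated 1-5 uL range.
    """
    if not 1.0 <= template_ul <= 5.0:
        raise ValueError("template volume should be 1-5 uL")
    oligo_ul = 0.3          # each of forward and reverse oligo
    oligo_stock_um = 100.0  # stock concentration in uM
    master_mix_ul = 25.0    # 2x master mix, i.e. half of the 50 uL total
    water_ul = total_ul - template_ul - 2 * oligo_ul - master_mix_ul
    final_oligo_um = oligo_stock_um * oligo_ul / total_ul
    return {"water_ul": water_ul, "final_oligo_um": final_oligo_um}


mix = pcr_mix(template_ul=1.0)
print(mix)
```

With these volumes each oligo ends up at 0.6 μM in the 50 μL reaction, and the water volume depends only on how much template is added.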

Chemical Cleavage of Phosphorothioate (PS)-Modified DNA

To cleave the PS bond in DNA molecules (FIG. 1c), forty microliters of purified DNA solution was mixed with 5.5 μL of 1 M Tris solution (3021, 1st BASE, pH adjusted to 9) and 10 μL of 30 g/L iodine solution (iodine: 207772, Sigma-Aldrich; solvent: ethanol), and was incubated at 70° C. for 5 min in a water bath. The solution was diluted with 250 μL of nuclease-free water, and purified by using a commercial DNA purification column as described above.

Preparation of Fragment

Fragments can be amplified from various sources (plasmid, synthetic DNA, genomic DNA etc.) by using PCR and Foligos (FIG. 10c and FIG. 21). Chemical treatment of the PS bond left a ‘C’ or ‘T’ sticky end on one end of the fragment, which could be paired with a Boligo that has the compatible sticky end (FIG. 1c, FIGS. 2a and 2b). Note that the chemically treated fragments must be purified by using a column (GeneJET Gel Extraction Kit, K0691, Thermo Fisher Scientific). Short fragments can be directly created by annealing Noligos (FIG. 20). The annealing reaction solution contained 50 μL of 100 μM G-Noligo and 50 μL of 100 μM A-Noligo. The annealing was done by using the following program in a thermo cycler: 98° C. for 2 min, 98 to 75° C. at a rate of 0.1° C./s, 75° C. for 2 min, 75 to 45° C. at a rate of 0.1° C./s, 45° C. for 2 min, 45° C. to 4° C. at a rate of 0.1° C./s, and hold at 4° C. The fragments prepared by using Noligos were diluted with nuclease-free water to 10-20 ng/μL for barcoding, and did not require purification. All the fragments used in this study are listed in Table 14.
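The annealing program lends itself to a small data representation, which also gives an estimate of its run time. This is a sketch under the assumption of ideal ramps with no instrument overhead; the step encoding and function name are hypothetical.

```python
# Each step is ("hold", temp_C, seconds) or ("ramp", start_C, end_C, rate_C_per_s).
ANNEALING_PROGRAM = [
    ("hold", 98.0, 120.0),
    ("ramp", 98.0, 75.0, 0.1),
    ("hold", 75.0, 120.0),
    ("ramp", 75.0, 45.0, 0.1),
    ("hold", 45.0, 120.0),
    ("ramp", 45.0, 4.0, 0.1),
]


def program_seconds(program) -> float:
    """Sum hold times and ideal ramp times (temperature span / rate)."""
    total = 0.0
    for step in program:
        if step[0] == "hold":
            total += step[2]
        else:
            _, start_c, end_c, rate = step
            total += abs(start_c - end_c) / rate
    return total


minutes = program_seconds(ANNEALING_PROGRAM) / 60.0
print(f"about {minutes:.1f} min before the final hold at 4 C")
```

Under these assumptions the program takes roughly 22 minutes before the final hold, most of it spent in the slow 0.1° C./s ramps.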

Phosphorylation and Folding of Boligos

Boligos need to have a phosphate group at the 5′ end and be properly folded. To reduce oligo synthesis cost, we ordered regular oligos and used T4 kinase to add the phosphate group. The phosphorylation reaction solution contained 1 μL of 100 μM Boligo, 2 μL of 10× T4 ligase buffer (B0202, NEB), 0.5 μL of T4 kinase (B0201, NEB) and 16.5 μL of nuclease-free water. Phosphorylation and folding of the Boligo were done by using the following conditions in a thermo cycler: 37° C. for 30 min (phosphorylation), 65° C. for 20 min (inactivation of T4 kinase), 98° C. for 2 min (DNA denaturing), 98 to 45° C. at a rate of 0.1° C./s, 45° C. for 2 min, 45° C. to 4° C. at a rate of 0.1° C./s, and hold at 4° C. The prepared Boligos (diluted properly by using ultrapure water) can be directly used in subsequent reactions without purification. All the Boligos (conventional and novel oligo design) used in this study are listed in Table 16. The workflow for creating conventional Boligos is elaborated in FIG. 17.

Barcoding

Prepared fragments and barcodes were ligated by using a commercial kit (Blunt/TA Ligase Master Mix, M0367, New England Biolabs). The type of ligase was critical in this step (results from using different ligases are provided in FIG. 24). The ligation reaction solution contained 3 μL of fragment, 0.3 μL of 1.25 μM L(G/A)-Boligo, 0.3 μL of 1.25 μM R(G/A)-Boligo, and 3.6 μL of Blunt/TA Ligase Master Mix. The ligation reaction was done by using the following program in a thermo cycler: 25° C. for 5 min, and hold at 4° C. It is critical to have a sufficient amount of high-quality fragment in the ligation reaction. The recommended minimal fragment concentration is 10 ng/μL for fragments no longer than 1 kb. The concentration refers to that of fragment with one-nt sticky ends. If the fragment is larger than 1 kb, we recommend using at least 100 ng/μL; if the fragment is larger than 2 kb, we recommend using at least 200 ng/μL. We used a Vacufuge (Eppendorf™ Vacufuge™ Concentrator) to concentrate the fragment solution when its concentration was too low. The absorbance spectrum (200-300 nm) of each fragment solution should be examined by using a Nanodrop (NanoDrop™ 2000/2000c Spectrophotometers, Thermo Fisher Scientific) or a similar device to ensure there is a peak at 260 nm before its fragment concentration data can be used in the calculation. We have successfully barcoded fragments from 0.035 kb to 5.259 kb.
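The length-dependent concentration recommendation can be written as a small lookup. The following is only a sketch of the thresholds stated above; the function names are hypothetical.

```python
def min_fragment_conc_ng_per_ul(length_bp: int) -> float:
    """Recommended minimal fragment concentration for barcoding.

    Thresholds as stated in the text: 10 ng/uL up to 1 kb,
    100 ng/uL above 1 kb, 200 ng/uL above 2 kb.
    """
    if length_bp <= 0:
        raise ValueError("fragment length must be positive")
    if length_bp > 2000:
        return 200.0
    if length_bp > 1000:
        return 100.0
    return 10.0


def is_concentrated_enough(length_bp: int, conc_ng_per_ul: float) -> bool:
    """True if the measured concentration meets the recommendation."""
    return conc_ng_per_ul >= min_fragment_conc_ng_per_ul(length_bp)


# The shortest and longest fragments reported as successfully barcoded.
for length in (35, 5259):
    print(length, "bp ->", min_fragment_conc_ng_per_ul(length), "ng/uL minimum")
```

A fragment that fails the check would be concentrated (for example with the Vacufuge mentioned above) before the ligation is set up.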

After ligation, corresponding Aoligos (FIG. 12c) were used to amplify correctly barcoded fragments by using PCR. This step was termed ligation PCR. The ligation product was directly used as template DNA in PCR without purification or dilution. The PCR product was purified and chemically cleaved as specified in the sections PCR and Chemical Cleavage of Phosphorothioate (PS)-Modified DNA. All Aoligos used in this study are listed in Table 17. All barcoded fragments used in this study are listed in Table 18.

TABLE 17 List of Aoligos used in this study (*: phosphorothioate bond)

RG-Aoligo | Sequence | LA-Aoligo | Sequence
RBS1-G-Assemble | ttgtttaac*tttaagaagg*agatatacatatG (SEQ ID NO: 422) | RBS1-T-Assemble | ccttcttaa*agttaaacaa*aattatttctA (SEQ ID NO: 423)
SRBS2-G-Assemble | ttcatttat*cacaaaagga*ttgttcgatG (SEQ ID NO: 424) | SRBS2-T-Assemble | tccttttgt*gataaatgaa*cggttA (SEQ ID NO: 425)
SRBS3-G-Assemble | acacagga*aacagct*atG (SEQ ID NO: 426) | SRBS3-T-Assemble | agctgttt*cctgtgt*gaatcA (SEQ ID NO: 427)
SRBS4-G-Assemble | caggtgaa*ggttccc*atG (SEQ ID NO: 428) | SRBS4-T-Assemble | gggaacct*tcacctg*aaaaaagaacaattaatttA (SEQ ID NO: 429)
pJ23119-G-Assemble | tagctca*gtcctagg*tataatactaG (SEQ ID NO: 430) | pJ23119-T-Assemble | cctagga*ctgagcta*gctgtcaA (SEQ ID NO: 431)
N21-G-Assemble | cctcgac*tcacactt*ggG (SEQ ID NO: 432) | N21-T-Assemble | aagtgtg*agtcgagg*gaA (SEQ ID NO: 433)
N22-G-Assemble | acacaac*atagccac*ggG (SEQ ID NO: 434) | N22T-T-Assemble | gtggcta*tgttgtgt*gaA (SEQ ID NO: 435)
N23-G-Assemble | caaacaga*aaggccat*ggG (SEQ ID NO: 436) | N23-T-Assemble | atggcctt*tctgtttg*aA (SEQ ID NO: 437)
N24-G-Assemble | ctcatgg*attctacg*ggG (SEQ ID NO: 438) | N24-T-Assemble | cgtagaa*tccatgag*gaA (SEQ ID NO: 439)
3UTR1-G-Assemble | tgggctg*aagggttt*aaG (SEQ ID NO: 440) | 3UTR1-T-Assemble | aaaccct*tcagccca*tcA (SEQ ID NO: 441)
N32-G-Assemble | cccatcac*agcttac*aaG (SEQ ID NO: 442) | N32T-Assemble | gtaagctg*tgatggg*acA (SEQ ID NO: 443)
LK3-G-Assemble | ggctcgt*cttcttca*ggG (SEQ ID NO: 444) | LK3-T-Assemble | tgaagaa*gacgagcc*tgA (SEQ ID NO: 445)
fiveUTR-G-Assemble | gcaatctaa*tctaagtt*ttctagaaaaatG (SEQ ID NO: 446) | fiveUTR-T-Assemble | aacttagat*tagattgc*tA (SEQ ID NO: 447)
3UTR2-G-Assemble | catcatcat*caccatca*ctaaaaG (SEQ ID NO: 422) | 3UTR2-T-Assemble | tgatggtga*tgatgatg*ttA (SEQ ID NO: 449)
N36-A-Assemble | taagggcg*aatgtac*agA (SEQ ID NO: 450) | N36-C-Assemble | gtacattc*gcctta*gtG (SEQ ID NO: 451)
ICeuI-A-Assemble | ataacggtc*ctaaggta*gcgaaA (SEQ ID NO: 452) | ICeuI-C-Assemble | taccttagg*accgttat*agttG (SEQ ID NO: 453)
N22-A-Assemble | acacaaca*tagccac*ggA (SEQ ID NO: 454) | N22-C-Assemble | gtggctat*gttgtgt*gaG (SEQ ID NO: 455)
3UTR2-A-Assemble | catcatcat*caccatca*ctaaaaA (SEQ ID NO: 456) | 3UTR2-C-Assemble | tgatggtga*tgatgatg*ttG (SEQ ID NO: 457)
N21-A-Assemble | cctcgac*tcacactt*ggA (SEQ ID NO: 458) | N21-C-Assemble | aagtgtg*agtcgagg*gaG (SEQ ID NO: 459)
N23-A-Assemble | caaacaga*aaggccat*ggA (SEQ ID NO: 460) | N23C-Assemble | atggcctt*tctgtttg*aG (SEQ ID NO: 461)

TABLE 18 List of barcoded fragments used in this study

SN | Barcoded fragment | Annotation
1 | (pJ23119-RG-Boligo)gRNA-nupG(N23-LA-Boligo) | guide RNA with promoter
2 | (pJ23119-RG-Boligo)gRNA-aslA(N23-LA-Boligo) | guide RNA with promoter
3 | (pJ23119-RG-Boligo)gRNA-melB(N23-LA-Boligo) | guide RNA with promoter
4 | (pJ23119-RG-Boligo)gRNA-rcsB(N23-LA-Boligo) | guide RNA with promoter
5 | (pJ23119-RG-Boligo)gRNA-tyrR(N23-LA-Boligo) | guide RNA with promoter
6 | (pJ23119-RG-Boligo)gRNA-pheA(N23-LA-Boligo) | guide RNA with promoter
7 | (pJ23119-RG-Boligo)gRNA-ptsI(N23-LA-Boligo) | guide RNA with promoter
8 | (pJ23119-RG-Boligo)gRNA-manZ(N23-LA-Boligo) | guide RNA with promoter
9 | (pJ23119-RG-Boligo)gRNA-glk(N23-LA-Boligo) | guide RNA with promoter
10 | (pJ23119-RG-Boligo)gRNA-xylB(N23-LA-Boligo) | guide RNA with promoter
11 | (N23-RG-Boligo)nupG-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
12 | (N23-RG-Boligo)nupG-HF1.0(N24-LA-Boligo) | Upstream homologous sequence
13 | (N23-RG-Boligo)nupG-HF1.5(N24-LA-Boligo) | Upstream homologous sequence
14 | (N23-RG-Boligo)aslA-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
15 | (N23-RG-Boligo)aslA-HF1.0(N24-LA-Boligo) | Upstream homologous sequence
16 | (N23-RG-Boligo)melB-HF1.0(N24-LA-Boligo) | Upstream homologous sequence
17 | (N23-RG-Boligo)rcsB-HF1.0(N24-LA-Boligo) | Upstream homologous sequence
18 | (N23-RG-Boligo)tyrR-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
19 | (N23-RG-Boligo)pheA-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
20 | (N23-RG-Boligo)ptsI-HF0.7(N24-LA-Boligo) | Upstream homologous sequence
21 | (N23-RG-Boligo)manZ-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
22 | (N23-RG-Boligo)glk-HF0.5(N24-LA-Boligo) | Upstream homologous sequence
23 | (N23-RG-Boligo)xylB-HF0.7(N24-LA-Boligo) | Upstream homologous sequence
24 | (N24-RG-Boligo)nupG-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
25 | (N24-RG-Boligo)nupG-HT1.0(N21-LA-Boligo) | Downstream homologous sequence
26 | (N24-RG-Boligo)nupG-HT1.5(N21-LA-Boligo) | Downstream homologous sequence
27 | (N24-RG-Boligo)aslA-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
28 | (N24-RG-Boligo)aslA-HT1.0(N21-LA-Boligo) | Downstream homologous sequence
29 | (N24-RG-Boligo)melB-HT1.0(N21-LA-Boligo) | Downstream homologous sequence
30 | (N24-RG-Boligo)rcsB-HT1.0(N21-LA-Boligo) | Downstream homologous sequence
31 | (N24-RG-Boligo)tyrR-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
32 | (N24-RG-Boligo)pheA-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
33 | (N24-RG-Boligo)ptsI-HT0.7(N21-LA-Boligo) | Downstream homologous sequence
34 | (N24-RG-Boligo)manZ-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
35 | (N24-RG-Boligo)glk-HT0.5(N21-LA-Boligo) | Downstream homologous sequence
36 | (N24-RG-Boligo)xylB-HT0.7(N21-LA-Boligo) | Downstream homologous sequence
37 | (N21-RG-Boligo)aadA(N22-LA-Boligo) | Antibiotic resistance gene expression cassette (spectinomycin)
38 | (N21-RG-Boligo)bla(N22-LA-Boligo) | Antibiotic resistance gene expression cassette (ampicillin)
39 | (N22-RG-Boligo)pSC101(pJ23119-LA-Boligo) | Replication origin, low copy number
40 | (N22-RG-Boligo)pAC(pJ23119-LA-Boligo) | Replication origin, medium copy number
41 | (N22-RG-Boligo)pMB1(pJ23119-LA-Boligo) | Replication origin, medium copy number
42 | (N22-RG-Boligo)pUC(pJ23119-LA-Boligo) | Replication origin, high copy number
43 | (N23-RG-Boligo)pLac(RBS1-LA-Boligo) | Promoter
44 | (N23-RG-Boligo)LaclPT7(RBS1-LA-Boligo) | Promoter with LacI repressor coding sequence
45 | (3UTR1-RG-Boligo)t7t(N21-LA-Boligo) | Terminator
46 | (3UTR1-RG-Boligo)t7t(LK3-LA-Boligo) | Terminator
47 | (LK3-RG-Boligo)Lacl(Lacl-LA-Boligo) | LacI repressor expression cassette
48 | (RBS1-RG-Boligo)Lacl(SRBS4-LA-Boligo) | Coding sequence
49 | (RBS1-RG-Boligo)tyrA mutant(SRBS4-LA-Boligo) | Coding sequence
50 | (SRBS4-RG-Boligo)tyrA mutant(3UTR1-LA-Boligo) | Coding sequence
51 | (SRBS4-RG-Boligo)aroG mutant(3UTR1-LA-Boligo) | Coding sequence
52 | (N23-RG-Boligo)pthrC3(RBS1-LA-Boligo) | Promoter
53 | (N22-RG-Boligo)pAC(RBS1-LA-Boligo) | Replication origin, medium copy number
54 | (SRBS2-RG-Boligo)tal(3UTR1-LA-Boligo) | Coding sequence
55 | (SRBS4-RG-Boligo)aroG mutant(SRBS2-LA-Boligo) | Coding sequence
56 | (SRBS4-RG-Boligo)tyrA mutant(SRBS2-LA-Boligo) | Coding sequence
57 | (SRBS2-RG-Boligo)aroG mutant(3UTR1-LA-Boligo) | Coding sequence
58 | (SRBS2-RG-Boligo)tyrA mutant(3UTR1-LA-Boligo) | Coding sequence
59 | (RBS1-RG-Boligo)SeSam8_tal(SRBS4-LA-Boligo) | Coding sequence
60 | (SRBS4-RG-Boligo)SeSam8_tal(SRBS2-LA-Boligo) | Coding sequence
61 | (SRBS2-RG-Boligo)SeSam8_tal(3UTR1-LA-Boligo) | Coding sequence
62 | (RBS1-RG-Boligo)ppsA(SRBS4-LA-Boligo) | Coding sequence
63 | (RBS1-RG-Boligo)tktA(SRBS4-LA-Boligo) | Coding sequence
64 | (RBS1-RG-Boligo)aroE(SRBS4-LA-Boligo) | Coding sequence
65 | (SRBS4-RG-Boligo)ppsA(SRBS2-LA-Boligo) | Coding sequence
66 | (SRBS4-RG-Boligo)tktA(SRBS2-LA-Boligo) | Coding sequence
67 | (SRBS4-RG-Boligo)aroE(SRBS2-LA-Boligo) | Coding sequence
68 | (SRBS2-RG-Boligo)ppsA(3UTR1-LA-Boligo) | Coding sequence
69 | (SRBS2-RG-Boligo)tktA(3UTR1-LA-Boligo) | Coding sequence
70 | (SRBS2-RG-Boligo)aroE(3UTR1-LA-Boligo) | Coding sequence
71 | (N36-RG-Boligo)SpecR(ICeuI-LA-Boligo) | Antibiotic resistance gene expression cassette (spectinomycin)
72 | (N36-RA-Boligo)SpecR-flipped(ICeuI-LG-Boligo) | Antibiotic resistance gene expression cassette (spectinomycin)
73 | (N36-LG-Boligo)SpecR(ICeuI-LA-Boligo) | Antibiotic resistance gene expression cassette (spectinomycin)
74 | (N36-RA-Boligo)SpecR-flipped(ICeuI-RG-Boligo) | Antibiotic resistance gene expression cassette (spectinomycin)
75 | (ICeuI-RG-Boligo)pMB1(N22-RA-Boligo) | Replication origin, medium copy number
76 | (ICeuI-LA-Boligo)pMB1-flipped(N22-LG-Boligo) | Replication origin, medium copy number
77 | (ICeuI-LG-Boligo)pMB1(N22-LA-Boligo) | Replication origin, medium copy number
78 | (ICeuI-RA-Boligo)pMB1-flipped(N22-RG-Boligo) | Replication origin, medium copy number
79 | (N22-RG-Boligo)P-gfp-T(N36-LA-Boligo) | GFP expression cassette
80 | (N22-LG-Boligo)P-gfp-T(N36-LA-Boligo) | GFP expression cassette

Direct Amplification of Barcoded Fragment from Constructed Plasmids

Barcoded fragments can also be directly amplified by using Aoligos from a plasmid if this plasmid contains the barcoded fragment (FIGS. 13d, 13e and 13f, FIG. 14a and FIG. 21). In such a case, barcoding was not needed. The PCR product can be chemically cleaved and used in DNA assembly.

DNA Assembly

We revised the CLIVA method to develop a new DNA assembly method that is highly efficient (FIG. 20). The assembly reaction solution contained 0.5 μL of Taq DNA ligase (M0208, NEB), 0.5 μL of the 10× buffer for the Taq DNA ligase, and 4 μL of mixture containing the barcoded fragments (they must have close to the same molar concentration). The recommended minimal molar concentration of barcoded fragment is 10 nM. The ligation reaction was done at 45° C. for 12 h by using a thermo cycler.
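Because fragment concentrations are typically measured in ng/μL while the assembly requirement is stated in nM, a conversion is needed when preparing an equimolar mixture. The sketch below assumes an average molar mass of about 650 g/mol per base pair of double-stranded DNA, a common approximation that is not given in the text; the function names are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def ng_per_ul_to_nm(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molar concentration (nM)."""
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; times 1e9 gives nM.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * length_bp)


def nm_to_ng_per_ul(conc_nm: float, length_bp: int) -> float:
    """Inverse conversion: mass concentration needed for a target nM."""
    return conc_nm * AVG_BP_MASS_G_PER_MOL * length_bp / 1e6


# Mass concentration needed to reach the recommended 10 nM for two lengths.
for length in (500, 3000):
    needed = nm_to_ng_per_ul(10.0, length)
    print(f"{length} bp: {needed:.2f} ng/uL for 10 nM")
```

The longer a fragment is, the higher its mass concentration must be to reach the same molar concentration, which is why fragments of different lengths cannot simply be mixed at equal ng/μL.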

One microliter of ligation product was mixed with 17 μL of E. coli Dh5α heat-shock competent cell solution (C2987H, NEB) in a pre-chilled 1.7 mL tube (Axygen) on ice. The tube was heat-shocked in a 42° C. water bath for exactly 35 s and was quenched on ice. The cell solution was mixed with 150 μL of SOC medium (NEB) and directly plated on an LB agar plate that contained a proper antibiotic. The plate was incubated at the temperature required by the specific application. Colonies usually appeared after 12 h when incubated at 37° C.

Colony PCR and Sanger sequencing were carried out to determine the accuracy of each DNA assembly. The accuracy was the product of colony PCR accuracy and sequencing accuracy. Colony PCR accuracy was the ratio of the number of positive colonies (determined by colony PCR) to that of all the tested colonies. For each DNA assembly, one or more positive colonies were cultured in LB with proper antibiotics overnight, and the plasmids extracted from them were further tested by Sanger sequencing (service providers: Axil Scientific, AITbiotech, and BioBasic). Sequencing accuracy was the ratio of the number of positive plasmids (free of mutation/deletion/insertion in the sequenced region) to that of all the sequenced plasmids. Colony PCR reaction solution contained 1 μL of colony suspension (one single colony was re-suspended in 100 μL of ultrapure water), 0.15 μL of 100 μM forward oligo, 0.15 μL of 100 μM reverse oligo, 5 μL of Q5 Hot Start High-Fidelity 2× Master Mix, and 3.7 μL of ultrapure water.
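The accuracy definition above is a product of two ratios, which can be illustrated as follows. The function name is hypothetical and the counts in the usage line are made-up illustration values, not results from this study.

```python
def assembly_accuracy(positive_colonies: int, tested_colonies: int,
                      positive_plasmids: int, sequenced_plasmids: int) -> float:
    """Accuracy = colony PCR accuracy x sequencing accuracy."""
    if tested_colonies <= 0 or sequenced_plasmids <= 0:
        raise ValueError("at least one colony and one plasmid must be tested")
    colony_pcr_accuracy = positive_colonies / tested_colonies
    sequencing_accuracy = positive_plasmids / sequenced_plasmids
    return colony_pcr_accuracy * sequencing_accuracy


# Hypothetical counts: 7 of 8 colonies positive, 2 of 2 plasmids mutation-free.
print(assembly_accuracy(7, 8, 2, 2))
```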

Genome editing of E. coli.

For genome editing of E. coli MG1655_ΔrecA_ΔendA_DE3, we utilized a two-plasmid CRISPR/Cas9 system² and constructed a few plasmids targeting varied loci (Table 19). Colony PCR verification was performed to evaluate the efficiency of gene deletion at the selected locus, and a full list of the targeted loci and the oligos used in colony PCR is provided in Table 19.

Strains

A list of strains used and constructed in this study is provided in Table 20. Each strain was derived from its parental strain through plasmid transformation done by using the standard electroporation protocol.

Llist of the efficiency of gene deletion at the targeted locus and oligos used to colony PCR in this study Gene Upstream deletion homologous efficiencysequence (Correct length/ colony Downstream number/ homologous TestedColony PCR Colony PCR Colony PCR Colony PCR Plasmids Parental sequencecolony forward forward oligo reverse reverse oligo name strain Locuslength (bp) number) oligo name sequence oligo name sequence GE1 MG1655_aslA  487/500 0/6 aslA TGGAACAAC aslA ACAGGCGAAA ΔrecA_ screening 4FAGGCATGGATT screening 4R TATGGTGCT ΔendA_ (SEQ ID (SEQ ID DE3 NO: 221)NO: 222) GE2 MG1655_ aslA  987/1000 1/6 aslA TGGAACAACA aslA ACAGGCGAAAΔrecA_ screening 4F GGCATGGATT screening 4R TATGGTGCT ΔendA_ (SEQ ID(SEQ ID DE3 NO: 221) NO: 222) GE3 MG1655_ nupG 4887/500 6/6 nupGGGAAATATGG nupG AGGATTATCCG ΔrecA_ screening 2F CGTTGATGAG screening 2RACATCAGTG ΔendA_ (SEQ ID (SEQ ID DE3 NO: 223) NO: 224) GE4 MG1655_ tyrR 422/507 3/16 tyrR AACGCTGGTA tyrR AGGCTTCCTC ΔrecA_ screening FTGCCTCAATC screening R GAATACCTTA ΔendA_ (SEQ ID (SEQ ID DE3 NO: 227)NO: 228) GE5 MG1655_ pheA  454/589 6/6 pheA TCATCAAATA pheA TCGAGCGGCTΔrecA_ screening F TGGCTCGCTT screening R GATATTGTTG ΔendA_ (SEQ ID(SEQ ID ΔtyrR_ NO: 225) NO: 226) DE3 GE6 MG1655_ nupG 1004/1050 8/8 nupGTATTGTGCCT nupG CGAATAAAGTG ΔrecA_ screening 4F ATGTGGCCTTC screening 4RGTGACGAATG ΔendA_ (SEQ ID (SEQ ID DE3 NO: 227) NO: 228) GE7 MG1655_ nupG1580/1501 6/8 nipG TATTGTGCCT nupG CGAATAAAGTGG ΔrecA_ screening 4FATGTGGCTTC screening 4R TGACGAATG ΔendA_ (SEQ ID (SEQ ID DE3 NO: 229)NO: 230) GE8 MG1655_ melB  993/1039 7/8 melB GTAAGCGGC melB GCAGGCCGTΔrecA_ screening 2F ATGGTCTGGAAC screening 2R ATGGACTCCTA ΔendA_ (SEQ ID(SEQ ID DE3 NO: 231) NO: 232) GE9 MG1655_ rcsB 1147/1026 8/8 rcsBAACTGGCGAA rcsB GCGATTATCT ΔrecA_ screening 3F TCAGGCAGA screening 3RCTCTATCCGT ΔendA_ (SEQ ID (SEQ ID DE3 NO: 233) NO: 234) GE10 MG1655_ptsHIcrr  656/776 0/4 ptsH/I_crr AGACCGATCTT ptsH/I_crr TAGTGTAATGAΔrecA_ screening F (SEQ ID (SEQ ID ΔendA_ NO: 235 NO: 
236) GE18 MG1655_xylB  512/500 7/8 xylB TCCTGAAACAG xylB CATGGATAGCT ΔrecA_ screening FTTTGGTCTGGA screening R CTCGTTGGT ΔendA_ (SEQ ID (SEQ ID DE3 NO: 462)NO: 463)

TABLE 20 List of strains constructed in this study Plasmid genotype(Barcodes used in one plasmid are placed into brackets ApplicationStrain name E. coli strain genotype between fragments) in this study A0BL21(DE3) (N21)SpecR(N22)pMB1 Empty (N23)t7t plasmid_FIG. 3e A1BL21(DE3) SpecR(Ceul)pMB1 Empty (N22)pthrC3_gfp_t7t(N36) plasmid_FIG. 3eA2 BL21(DE3) SpecR flippedICeul)pMB1 Empty (N22)pthrC3_gfp_t7t(N36)plasmid_FIG. 3e A3 BL21(DE3) SpecR(Ceul)pMB1 Emptyflipped(N22)pthrC3_gfp_t7t(N36) plasmid_FIG. 3e A4 BL21(DE3) SpecRflipped(Ceul)pMB1 Empty flipped(N22)pthrC3_gfp_t7t(N36) plasmid_FIG. 3eA5 BL21(DE3) SpecR(Ceul flipped)pMB1 Empty (N22)pthrC3_gfp_t7t(N36)plasmid_FIG. 3e A6 BL21(DE3) SpecR flipped(Ceul Emptyflipped)pMB1(N22)pthrC3_gfp_t7t(N36) plasmid_FIG. 3e A7 BL21(DE3)SpecR(Ceul flipped)pMB1 Empty flipped(N22)pthrC3_gfp_t7t(N36)plasmid_FIG. 3e A8 BL21(DE3) SpecR flipped(Ceul flipped)pMB1 Emptyflipped(N22)pthrC3_gfp_t7t(N36) plasmid_FIG. 3e TPP0MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(N23)t7t Emptyplasmid_FIG. 4a TPP1 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pSC101(N23)laclT7 Tyrosine (RBS1)aroG mutant(SRBS4)tyrAproduction_FIG. 4c mutant(3UTR1)t7t(N21) TPP2MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(N23)laclT7Tyrosine (RBS1)aroG mutant(SRBS4) production_FIG. 4c tyrAmutant(3UTR1)t7t(N21) TPP3 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pMB1(N23)laclT7 Tyrosine (RBS1)aroG mutant(SRBS4)tyrAproduction_FIG. 4c mutant(3UTR1)t7t(N21) TPP4MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pUC((N23)laclT7Tyrosine (RBS1)aroG mutant(SRBS4) production_FIG. 4c tyrAmutant(3UTR1)t7t(N21) TPP5 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pSC101(N23)pLac Tyrosine (RBS1)aroG mutant(SRBS4)production_FIG. 4c tyrA mutant(3UTR1)t7t(LK3)Lacl(N21) TPP6MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(N23)pLac Tyrosine(RBS1)aroG mutant(SRBS4) production_FIG. 
4c tyrAmutant(3UTR1)t7t(LK3)Lacl(N21) TPP7 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pMB1(N23)pLac Tyrosine (RBS1)aroG mutant(SRBS4)production_FIG. 4c tyrA mutant(3UTR1)t7t(LK3)Lacl(N21) TPP8MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pUC((N23)pLac Tyrosine(RBS1)aroG mutant(SRBS4)tyrA production_FIG. 4cmutant(3UTR1)t7t(LK3)Lacl(N21) TPP9 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pSC101 Tyrosine (N23)laclT7(RBS1) tyrA production_FIG. 4cmutant(SRBS4)aroG mutant(3UTR1)t7t(N21) TPP10MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC Tyrosine(N23)laclT7(RBS1) tyrA production_FIG. 4c mutant(SRBS4)aroGmutant(3UTR1)t7t(N21) TPP11 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pMB1 Tyrosine (N23)laclT7(RBS1)tyrA production_FIG. 4cmutant(SRBS4)aroG mutant(3UTR1)t7t(N21) TPP12MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pUC Tyrosine((N23)laclT7(RBS1) tyrA production_FIG. 4c mutant(SRBS4)aroGmutant(3UTR1)t7t(N21) TPP13 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pSC101 Tyrosine (N23)pLac(RBS1)tyrA production_FIG. 4cmutant(SRBS4)aroG mutant(3UTR1)t7t(LK3)Lacl(N21) TPP14MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(N23)pLac(RBS1)Tyrosine tyrA mutant(SRBS4)aroG production_FIG. 4cmutant(3UTR1)t7t(LK3)Lacl(N21) TPP15 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pMB1(N23)pLac(RBS1) Tyrosine aroG mutant(SRBS4)tyrAproduction_FIG. 4c mutant(3UTR1)t7t(LK3)Lacl(N21) TPP16MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pUC(N23)pLac(RBS1)Tyrosine tyrA mutant(SRBS4)aroG production_FIG. 4cmutant(3UTR1)t7t(LK3)Lacl(N21) TPP17 MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3(N21)SpecR(N22)pAC(N23)PthrC3(RBS1) Tyrosine aroG mutant(SRBS4)tyrAmutant(3UTR1)t7t production_FIG. 4d TPP18MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(RBS1)aroG mutantTyrosine (SRBS4)tyrA mutant(3UTR1)t7t production_FIG. 4e PCAP1MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 (N21)SpecR(N22)pAC(N23)PthrC3(RBS1)Coumaric aroG mutant(SRBS4)tyrA acid mutant(RBS2stop)tal(3UTR1)t7tproduction_FIG. 
4f Library Dh5α (5UTR)pthrC3(RBS1)ppsA(SRBS4)aroEConstruct 1_M1_Plasmid 1 (SRBS2)tktA(3UTR1)t7t(3UTR2)SpecR(N22)pAC M1variant 1_FIG. 5a Library Dh5α (5UTR)pthrC3(RBS1)ppsA(SRBS4)tktAConstruct 1_M1_Plasmid 2 (SRBS2)aroE(3UTR1)t7t(3UTR2)SpecR(N22)pAC M1variant 2_FIG. 5a Library Dh5α (5UTR)pthrC3(RBS1)tktA(SRBS4)aroEConstruct 1_M1_Plasmid 3 (SRBS2)ppsA(3UTR1)t7t(3UTR2)SpecR(N22)pAC M1variant 3_FIG. 5a Library Dh5α (5UTR)pthrC3(RBS1)tktA(SRBS4)ppsAConstruct 1_M1_Plasmid 4 (SRBS2)aroE(3UTR1)t7t(3UTR2)SpecR(N22)pAC M1variant 4_FIG. 5a Library Dh5α (5UTR)pthrC3(RBS1)aroE(SRBS4)ppsAConstruct 1_M1_Plasmid 5 (SRBS2)tktA(3UTR1)t7t(3UTR2)SpecR(N22)pAC M1variant 5_FIG. 5a Library Dh5α (5UTR)pthrC3(RBS1)aroE(SRBS4) Construct1_M1_Plasmid 6 tktA(SRBS2)ppsA M1 variant (3UTR1)t7t(3UTR2)SpecR(N22)pAC6_FIG. 5a Library Dh5α (3UTR2)pthrC3(RBS1)aroG mutant(SRBS4) Construct2_M2_Plasmid 1 tal(SRBS2)tyrA mutant M2 variant(3UTR1)t7t(N36)SpecR(N22)pAC 1_FIG. 5a Library Dh5α(3UTR2)pthrC3(RBS1)aroG mutant(SRBS4) Construct 2_M2_Plasmid 2 tyrAmutant(SRBS2)tal M2 varient (3UTR1)t7t(N36)SpecR(N22)pAC 2_FIG. 5aLibrary Dh5α (3UTR2)pthrC3(RBS1)tyrA mutant(SRBS4) Construct2_M2_Plasmid 3 tal(SRBS2)aroG mutant M2 variant3UTR1)t7t(N36)SpecR(N22)pAC 3_FIG. 5a Library Dh5α(3UTR2)pthrC3(RBS1)tyrA mutant(SRBS4) Construct 2_M2_Plasmid 4 aroGmutant(SRBS2)tal M2 varient (3UTR1)t7t(N36)SpecR(N22)pAC 4_FIG. 5aLibrary Dh5α (3UTR2)pthrC3(RBS1)tal(SRBS4)tyrA Construct 2_M2_Plasmid 5mutant(SRBS2)aroG mutant M2 variant (3UTR1)t7t(N36)SpecR(N22)pAC 5_FIG.5a Library Dh5α (3UTR2)pthrC3(RBS1)tal(SRBS4)aroG Construct 2_M2_Plasmid6 mutant(SRBS2)tyrA mutant M2 varient (3UTR1)t7t(N36)SpecR(N22)pAC6_FIG. 5a Library MG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 pSC101(5UTR)Module1 Screening 1_M1 + M2 fragment mixture(3UTR2) plasmids Module 2fragmentmixture library (N36)AmpR(N22) 1_FIG. 
5c LibraryMG1655_ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 pAC(5UTR)Module 1 Screening 2_M1 + M2fragment mixture(3UTR2) plasmids Module 2 fragment librarymixture(N36)SpecR(N22) 2_FIG. 5c

Construction of Combinatorial Plasmid Library

To construct variants of M1 or M2 in the first-tier construction (FIG. 14a), each fragment (ppsA, aroE, tktA, aroG, tal or tyrA) was barcoded by three sets of RG-Boligo/LA-Boligo: RBS1/SRBS4, SRBS4/SRBS2 and SRBS2/3UTR1. The barcoded fragments can be assembled into the 12 plasmids when they are properly combined (FIG. 23a). Each plasmid was verified by colony PCR and Sanger sequencing (data not shown). Each M1 plasmid (˜1 ng/μL) was used as template to amplify one M1 fragment (FIG. 23a) by using RG-Aoligo (5UTR) and LA-Aoligo (3UTR2). Each M2 plasmid (˜1 ng/μL) was used as template to amplify one M2 fragment (FIG. 23a) by using RG-Aoligo (3UTR2) and LA-Aoligo (N36). These M1 and M2 fragments were considered to be barcoded because Aoligos were used in the PCR, and thus can be directly assembled following our workflow (FIG. 10c). Six M1 fragments and six M2 fragments were equimolarly assembled with one plasmid backbone (barcoded) in one reaction to create a mixture of 36 plasmids (a plasmid library). Technical details in this step: the chemically cleaved fragments were ligated, and 2 μL of the ligation product was used to transform 34 μL of competent cells, which were spread on 90 mm LB agar with a proper antibiotic; all the obtained colonies were resuspended in 6 mL of LB medium, and the mixed plasmids were extracted directly from this suspension. Two plasmid libraries were prepared, and each library had a different plasmid backbone (pSC101+Amp^(R) or pAC+Spec^(R)).
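The combinatorics of the two-tier construction can be enumerated directly. The sketch below pairs each gene order with the three barcode sets named above, one per operon position; the variable and function names are hypothetical, and the grouping of genes into two sets follows the gene list in this section without asserting which set is M1 or M2.

```python
from itertools import permutations, product

# RG-Boligo/LA-Boligo pairs for operon positions 1, 2 and 3.
BARCODE_SLOTS = [("RBS1", "SRBS4"), ("SRBS4", "SRBS2"), ("SRBS2", "3UTR1")]

GENE_SET_1 = ("ppsA", "aroE", "tktA")
GENE_SET_2 = ("aroG", "tal", "tyrA")


def operon_variants(genes):
    """All gene orders, each gene tagged with the barcodes of its position."""
    return [tuple(zip(order, BARCODE_SLOTS)) for order in permutations(genes)]


variants_1 = operon_variants(GENE_SET_1)
variants_2 = operon_variants(GENE_SET_2)
library = list(product(variants_1, variants_2))

print(len(variants_1), len(variants_2), len(library))
print(variants_1[0])
```

Each gene needs one barcoded version per position (three per gene), which is why every fragment is barcoded with all three Boligo sets before the first-tier assembly.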

The quality of each plasmid library was checked by using colony PCR. In the above step, colonies were randomly picked after competent cells were transformed with the ligation product (a mixture). Each colony was tested by using two pairs of oligos. The first pair targeted RO and ppsA (M1), and it would generate amplicons with varied lengths when the colonies contained plasmids that had ppsA at different positions of the operon (FIG. 23b). Similarly, the second pair targeted tal (M2) and AR. The expected amplicon lengths are listed in FIG. 23c. Six colonies were tested for each plasmid library, and colony PCR results showed that each plasmid library indeed contained various plasmids (FIG. 23b). The oligos used for colony PCR verification are listed in Table 21.

Two microliters of plasmid mixture from each library were used to transform E. coli MG1655 ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3 through the standard electroporation procedure. Seventy-two colonies were randomly picked for each library, and each colony was screened for its ability to produce coumaric acid (see the next section for how the cells were cultured and the coumaric acid concentration was determined). After the screening, the top two coumaric acid-producing strains of each library were selected and the plasmids they harbored were sequenced to elucidate the responsible arrangement of the genetic parts.

TABLE 21
List of oligos used for colony PCR in quality control of the plasmid libraries in this study

Strain and library | Plasmids | Colony PCR forward oligo name | Colony PCR forward oligo sequence | Colony PCR reverse oligo name | Colony PCR reverse oligo sequence
Dh5α_Library 1 | pSC101 (5UTR) Module 1 fragment mixture (3UTR2) Module 2 fragment mixture (N36) AmpR (N22) | RO1-f (p5_library_f) | TAGACCCTCTGTAAATTCCG (SEQ ID NO: 464) | ppsA-r (ppsA_library_r) | AAGCTGAGTAACATCGTCAATA (SEQ ID NO: 465)
Dh5α_Library 1 | (as above) | tal-f | TGAACTGGCAGGTATTTGTCC (SEQ ID NO: 466) | AR1-r (AmpR_library_r) | TAAGGGCGACACGGAAATGT (SEQ ID NO: 467)
Dh5α_Library 2 | pAC (5UTR) Module 1 fragment mixture (3UTR2) Module 2 fragment mixture (N36) SpecR (N22) | RO2-f (p15_library_f) | CAAGAGATTACGCGCAGACC (SEQ ID NO: 468) | ppsA-r | AAGCTGAGTAACATCGTCAATA (SEQ ID NO: 469)
Dh5α_Library 2 | (as above) | tal-f | TGAACTGGCAGGTATTTGTCC (SEQ ID NO: 466) | AR2-r (SpecR_library_r) | AGCCGTACAAATGTACGGCC (SEQ ID NO: 470)

Culture and Analysis of Tyrosine/Coumaric Acid-Producing E. coli

Each of plasmids TPP1-16 was used to transform E. coli TPSO (genotype: MG1655 ΔrecA_ΔendA_ΔpheA_ΔtyrR_DE3) by using a standard electroporation protocol. The resulting strains were named TPS1-16. To test these strains, a single colony was inoculated into LB with 50 μg/mL of spectinomycin and cultured at 37° C./250 rpm overnight. One hundred microliters of the overnight cell suspension was inoculated into 10 mL of K3 medium (composition specified below) with 50 μg/mL of spectinomycin, and the culture was incubated at 30° C./250 rpm until the cell density reached 0.5-1.0 (OD600), at which point the culture was induced with 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). One milliliter of the induced cells was transferred to a 14 mL round-bottom Falcon tube. If PthrC3 was used, IPTG induction was skipped and the cell culture was started in the 14 mL tube. The tube was incubated at 30° C./250 rpm for 84 h.

At the end of incubation, one hundred microliters of 6 M HCl was added to 1 mL of cell culture broth to dissolve tyrosine crystals. The mixture was incubated at 37° C./250 rpm for 30 min, and then centrifuged at 13,500 g for 5 min. The supernatant was filtered by using a 13 mm, 0.2 μm nylon filter.

To measure tyrosine titer, two microliters of the filtered supernatant prepared according to the above protocol was analyzed by high-performance liquid chromatography (HPLC, Shimadzu LC-10). The HPLC conditions were as follows: the column was an Agilent ZORBAX Eclipse Plus C18, 100 mm; an isocratic flow was used (the flow rate was 0.7 mL/min and the mobile phase consisted of 10% [v/v] acetonitrile and 90% [v/v] aqueous solution containing 0.1% [v/v] trifluoroacetic acid); the column temperature was 30° C.; and the detector was a UV detector (wavelength: 254 nm).

To measure coumaric acid titer, three hundred microliters of acidified medium (without centrifugation) was mixed with 700 μL of acetonitrile, the mixture was incubated at 30° C. for 1 h and then centrifuged at 13,500 g for 5 min, and two microliters of the supernatant was analyzed by HPLC (Shimadzu LC-10). The HPLC conditions were as follows: the column was an Agilent ZORBAX Eclipse Plus C18, 100 mm; an isocratic flow was used (the flow rate was 1 mL/min and the mobile phase consisted of 35% [v/v] acetonitrile and 65% [v/v] aqueous solution containing 0.1% [v/v] trifluoroacetic acid); the column temperature was 30° C.; and the detector was a UV detector (wavelength: 285 nm).
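The sample preparation above dilutes the culture broth before injection, so a concentration read from the chromatogram would need to be scaled back to the broth. The sketch below illustrates one way to do this; it is an assumption, since the text does not state how titers were back-calculated. It assumes that volumes are additive and that the measured concentration refers to the injected sample. The function names are hypothetical.

```python
def dilution_factor(sample_volume_ul, added_volume_ul):
    """Fold dilution when added_volume_ul of liquid is mixed into sample_volume_ul.

    Assumes volumes are additive (an approximation for water/acetonitrile mixtures).
    """
    if sample_volume_ul <= 0 or added_volume_ul < 0:
        raise ValueError("volumes must be positive (sample) and non-negative (added)")
    return (sample_volume_ul + added_volume_ul) / sample_volume_ul


# 100 uL of 6 M HCl added to 1 mL of broth (both analytes).
ACIDIFICATION = dilution_factor(1000.0, 100.0)
# 300 uL of acidified medium mixed with 700 uL of acetonitrile (coumaric acid only).
ACETONITRILE = dilution_factor(300.0, 700.0)

TYROSINE_FACTOR = ACIDIFICATION
COUMARIC_FACTOR = ACIDIFICATION * ACETONITRILE


def broth_concentration(measured_concentration, factor):
    """Scale a concentration measured in the injected sample back to the broth."""
    return measured_concentration * factor


print(round(TYROSINE_FACTOR, 3), round(COUMARIC_FACTOR, 3))
```

Under these assumptions the tyrosine sample is diluted about 1.1-fold and the coumaric acid sample about 3.67-fold relative to the original broth.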

K3 medium consisted of 89.8% (v/v) of K3 basal medium, 10% (v/v) carbon source stock solution and 0.17% (v/v) K3 master mix. K3 basal medium was prepared by dissolving 4 g of (NH₄)₂HPO₄ and 13.3 g of KH₂PO₄ in 1 L of deionized water, and autoclaving the solution. The carbon source stock solution was 200 g/L glucose solution (autoclaved). K3 master mix was prepared by mixing 2.5 mL of 0.1 M ferric citrate solution (autoclaved), 1 mL of 4.5 g/L thiamine solution (filtered through a 0.2 μm filter), 3 mL of 4 mM Na₂MoO₃ (autoclaved), 1 mL of 1000× K3 trace elements stock solution (autoclaved) and 1 mL of 1 M MgSO₄ solution (autoclaved). The 1000× K3 trace elements stock solution was prepared by dissolving 5 g of CaCl₂.2H₂O, 1.6 g of MnCl₂.4H₂O, 0.38 g of CuCl₂.4H₂O, 0.5 g of CoCl₂.2H₂O, 0.94 g of ZnCl₂, 0.0311 g of H₃BO₃ and 0.4 g of Na₂EDTA.2H₂O in 1 L of deionized water, and autoclaving this solution.
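The recipe above can be turned into final concentrations in the complete K3 medium by multiplying each stock concentration by its volume fraction. The following sketch is illustrative only. It assumes additive volumes, treats the basal medium salts as 4 g/L and 13.3 g/L (that is, it neglects the volume change on dissolving them in 1 L), and uses the stated volume percentages as given. The dictionary layout and function names are hypothetical.

```python
K3_BASAL_FRACTION = 0.898    # 89.8% (v/v)
CARBON_FRACTION = 0.10       # 10% (v/v)
MASTER_MIX_FRACTION = 0.0017  # 0.17% (v/v)

# Master mix components: name -> (stock concentration, unit, volume in mL).
MASTER_MIX = {
    "ferric citrate": (0.1, "M", 2.5),
    "thiamine": (4.5, "g/L", 1.0),
    "Na2MoO3": (0.004, "M", 3.0),
    "MgSO4": (1.0, "M", 1.0),
}
TRACE_ELEMENTS_ML = 1.0  # 1000x trace elements stock, counted toward the mix volume


def master_mix_volume_ml():
    """Total volume of one batch of K3 master mix."""
    return sum(volume for _, _, volume in MASTER_MIX.values()) + TRACE_ELEMENTS_ML


def final_concentrations():
    """Final concentration of each listed component in complete K3 medium."""
    total = master_mix_volume_ml()
    final = {
        "glucose (g/L)": 200.0 * CARBON_FRACTION,
        "(NH4)2HPO4 (g/L)": 4.0 * K3_BASAL_FRACTION,
        "KH2PO4 (g/L)": 13.3 * K3_BASAL_FRACTION,
    }
    for name, (stock, unit, volume) in MASTER_MIX.items():
        # Dilution within the master mix, then dilution of the mix into the medium.
        final[f"{name} ({unit})"] = stock * volume / total * MASTER_MIX_FRACTION
    return final


for component, value in final_concentrations().items():
    print(f"{component}: {value:.6g}")
```

With these assumptions, one 8.5 mL batch of master mix corresponds to 5 L of medium, and the glucose concentration in the complete medium is 20 g/L.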

Culture and Analysis of GFP-Expressing E. coli

Each of plasmids A0-8 was used to transform E. coli BL21 (DE3) (C2527H, NEB). A single colony was inoculated into LB with 50 μg/mL of spectinomycin and cultured at 37° C./250 rpm overnight. Fifty microliters of the overnight cell suspension was inoculated into 5 mL of K3 medium with 50 μg/mL of spectinomycin, and the culture was incubated in a 50 mL Falcon tube at 37° C./250 rpm for 24 h. The optical density at 600 nm (OD600) of the cell suspensions was determined by using a microplate reader (Varioskan LUX Multimode Microplate Reader, Thermo Fisher Scientific). For each sample, two hundred microliters of cell suspension was loaded into a well of a 96-well optical plate and assayed with the following parameter settings: the excitation wavelength was 483 nm, the emission wavelength was 535 nm, the measurement time was 100 ms, and the bandwidth of the excitation and emission light was 12 nm. The fluorescence signal was normalized by the OD600 of the cell suspension to calculate the specific fluorescence signal.
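The normalization described above is a simple ratio of the fluorescence reading to the OD600 of the same well. A minimal sketch follows; the function name and the example readings are hypothetical and are not measured data, and no blank subtraction is applied because none is described here.

```python
def specific_fluorescence(fluorescence, od600):
    """Fluorescence signal normalized by cell density (OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive to normalize fluorescence")
    return fluorescence / od600


# Hypothetical plate-reader readings: sample name -> (fluorescence, OD600).
readings = {
    "sample_1": (1200.0, 2.0),
    "sample_2": (450.0, 1.5),
}

for name, (fluorescence, od600) in readings.items():
    print(name, specific_fluorescence(fluorescence, od600))
```

Normalizing in this way allows cultures that reached different cell densities to be compared on a per-cell-density basis.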

Example 11: Further Discussion

Biotechnology is transforming how humans generate fuels, produce chemicals, and treat diseases. Developing the needed technologies often requires construction of a plasmid, a vector for carrying genetic information. Currently, most researchers construct plasmids in a highly inefficient way: they customize genetic materials, pay commercial companies to synthesize the materials, wait for many days, and often only use them once. In this study, we report a standard (GT assembly standard [GTas]) under which most functional DNA sequences (including very short and long ones) can be defined as standard parts, and a method that can assemble up to 14 of them into one plasmid in one round of operation. Based on 370 plasmids we have constructed, the average accuracy of this plasmid construction method is 86%. The standard parts can be flipped and arranged in any order as long as a simple rule is followed, and there is no scar (junk DNA sequence) in most junctions of the parts, making it possible to standardize construction of almost any plasmid. Plasmids constructed under this standard can also be easily edited, and/or be further assembled into more complex plasmids by using standard oligonucleotides. GTas may lead to commercial standard DNA parts sold as catalogued chemicals and/or in research kits, which would lower the cost of acquiring these materials for researchers, and enable our community to utilize its limited DNA synthesis power more efficiently.

Researchers working on biotechnology projects often order customized DNA oligonucleotides (oligos), which can only be used by the lab that placed the order because the oligos are tailored for their specific applications. The labs can usually consume less than 1% of each oligo they order, even when the minimal quantity is requested, because the supplier has difficulty in, or no incentive for, scaling down the synthesis scale. Because there is no mechanism for sharing oligos, many identical oligos are being repeatedly synthesized. Together, these lead to suboptimal use of society's DNA synthesis power, which already lags substantially behind the DNA reading capability²⁸.

A solution to this problem is the use of standard DNA parts, which has been explored but has yet to be adopted widely by the whole biotechnology community, possibly due to flaws in the previous designs. The most well-known standard of biological parts is BioBricks¹⁴, which has been used since 2003 mainly by the international Genetically Engineered Machine (iGEM)²⁹, a student competition in synthetic biology. Through the competition, the BioBricks Foundation collects and distributes plasmids that carry standard parts, which can be combined to produce new standard parts with the help of restriction enzymes. This system is slow in combining parts together: only two parts can be combined in every round. So, soon after the technologies that can assemble multiple DNA parts were developed around 2009, including the Gibson¹⁹ and Golden Gate³⁰ methods, research labs swiftly adopted them to speed up projects, and have, unfortunately, mostly used customized parts to date.

In 2015, the BASIC standard was developed⁷, which allowed the use of standard parts in multi-piece DNA assembly, but it has so far not been used globally, possibly due to some of its limitations, including low accuracy, difficulty in reuse of constructed plasmids, and leaving large scars.

Here, we report a new plasmid construction standard (GTas) and the development of new technologies for implementing it, which together overcome all the limitations of existing standards and allow much more flexible arrangement of standard parts. GTas has the potential to substantially reduce the cost and time of plasmid construction in biotechnological applications, and to improve the efficiency of utilizing our DNA synthesis power.

Perspective on Transforming Plasmid Construction Practices in Global Biotechnology Community

GTas may lead to new and cheaper distribution mechanisms for sharing standard parts in the global biotechnology community (FIG. 15a). As the full information of a barcode is coded in a set of DNA oligos, a large library of barcodes can be easily provided by service providers at low cost to individual researchers through commercial kits, which contain a large number of standard oligos. There would be an incentive for companies to develop such kits, because one oligo they synthesize/order at its minimal scale can be used for preparing at least 100 kits, making the cost of each oligo in a kit low and the profit margin of the kit reasonable. A kit can be optimized and developed for a popular application, such as “Engineering metabolism of E. coli”, or could be customized by allowing users to choose oligos from a large library of a company.

Distributing a physical copy of a fragment requires two oligos and one template DNA. The oligos may be distributed in a way similar to those of barcodes, though the number of oligos is large as fragments are diverse. The template DNA could be genomic DNA, complementary DNA, synthetic DNA or plasmid, which can also be included in commercial kits or sold as catalogued chemicals.

The DNA sequences of these standard parts may be patented, but the parts can still be sold if a licensing clause is included in the sales agreement, which protects intellectual property and streamlines the licensing process. Plasmids constructed under GTas, together with related oligos and other materials, can also be shared among users directly or through service providers (FIG. 15a), such as Addgene. In such cases, licensing terms need to be customized (“Free to use” or “Open access” is also a customized term). These shareable materials from peers could be conveniently assimilated into any plasmid construction when GTas is used. PCR products created by using Aoligos could also be assembled by using some existing assembly methods, including Gibson^(A6) and In-Fusion (Clontech) methods.

Through these new mechanisms, repeated synthesis of the same oligos and longer DNA molecules (e.g. genes) can be reduced, the utilization rate of any synthesized DNA oligos may be maximized, and the waiting time for customers would be shortened (standard parts are ready to ship when they are in stock, unlike customized parts that need manufacturing). As supporting evidence, after we surveyed the 370 plasmids we have constructed so far in our lab under GTas, we found that oligos associated with eleven barcodes have been reused more than 50 times (FIG. 15b). Oligos associated with the most frequently used barcode (RBS1) have been used up to 262 times. If this practice is adopted by a community, we would expect much more frequent reuse of standard parts.
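A reuse survey of this kind can be sketched as a tally of barcodes across plasmid records. The example below is an illustration under stated assumptions, not the procedure used for FIG. 15b: the record format, the plasmid names, the function names and the threshold are hypothetical, and each barcode is counted once per plasmid in which it appears. Only the barcode names (RBS1, SRBS4, SRBS2, 3UTR1) are taken from the text.

```python
from collections import Counter


def barcode_usage(plasmids):
    """Count the number of plasmids in which each barcode appears."""
    usage = Counter()
    for barcodes in plasmids.values():
        usage.update(set(barcodes))  # count each barcode once per plasmid
    return usage


def frequently_reused(usage, threshold):
    """Barcodes used in more than `threshold` plasmids, most used first."""
    names = [name for name, count in usage.items() if count > threshold]
    return sorted(names, key=lambda name: (-usage[name], name))


# Hypothetical plasmid records: plasmid name -> barcodes used to build it.
plasmids = {
    "plasmid_A": ["RBS1", "SRBS4", "3UTR1"],
    "plasmid_B": ["RBS1", "SRBS2"],
    "plasmid_C": ["RBS1", "SRBS4"],
}

usage = barcode_usage(plasmids)
print(frequently_reused(usage, threshold=1))
```

Applied to a real plasmid registry, the same tally would identify which standard oligos are worth stocking in larger quantities.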

REFERENCES

Any listing or discussion of an apparently prior-published document in this specification should not necessarily be taken as an acknowledgement that such document is part of the state of the art or is common general knowledge.

1. Fernandez-Rodriguez, J., Moser, F., Song, M. & Voigt, C. A. Engineering RGB color vision into Escherichia coli. Nat. Chem. Biol. (2017). doi:10.1038/nchembio.2390
2. Yim, H. et al. Metabolic engineering of Escherichia coli for direct production of 1,4-butanediol. Nat. Chem. Biol. 7, 445-452 (2011).
3. Danino, T. et al. Programmable probiotics for detection of cancer in urine. Sci. Transl. Med. 7, 289ra84 (2015).
4. Eisenstein, M. Living factories of the future. Nature 531, 401-403 (2016).
5. Check Hayden, E. Synthetic biology called to order. Nature 141-142 (2015). doi:10.1038/520141a
6. Hillson, N. J., Rosengarten, R. D. & Keasling, J. D. J5 DNA assembly design automation software. ACS Synth. Biol. 1, 14-21 (2012).
7. Storch, M. et al. BASIC: A New Biopart Assembly Standard for Idempotent Cloning Provides Accurate, Single-Tier DNA Assembly for Synthetic Biology. ACS Synth. Biol. 4, 781-787 (2015).
8. Ngo, A. H., Ibañez, M. & Do, L. H. Catalytic Hydrogenation of Cytotoxic Aldehydes Using Nicotinamide Adenine Dinucleotide (NADH) in Cell Growth Media. ACS Catal. 6, 2637-2641 (2016).
9. Salis, H. M., Mirsky, E. A. & Voigt, C. A. Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol. 27, 946-950 (2009).
10. Ajikumar, P. K. et al. Isoprenoid pathway optimization for Taxol precursor overproduction in Escherichia coli. Science 330, 70-74 (2010).
11. Bassalo, M. C. et al. Rapid and Efficient One-Step Metabolic Pathway Integration in E. coli. ACS Synth. Biol. 5, 561-568 (2016).
12. Jinek, M. et al. A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity. Science 337, 816-822 (2012).
13. Jiang, Y. et al. Multigene editing in the Escherichia coli genome via the CRISPR-Cas9. Appl. Environ. Microbiol. 81, 2506-2514 (2015).
14. Shetty, R. P., Endy, D. & Knight Jr., T. F. Engineering BioBrick vectors from BioBrick parts. J. Biol. Eng. 2, 5 (2008).
15. Engler, C., Gruetzner, R., Kandzia, R. & Marillonnet, S. Golden gate shuffling: A one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS One 4, (2009).
16. Crook, N. C., Freeman, E. S. & Alper, H. S. Re-engineering multicloning sites for function and convenience. Nucleic Acids Res. 39, (2011).
17. Casini, A., Storch, M., Baldwin, G. S. & Ellis, T. Bricks and blueprints: methods and standards for DNA assembly. Nat. Rev. Mol. Cell Biol. 16, 568-576 (2015).
18. Li, M. Z. & Elledge, S. J. Harnessing homologous recombination in vitro to generate recombinant DNA via SLIC. Nat. Methods 4, 251-256 (2007).
19. Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343-345 (2009).
20. Bitinaite, J. et al. USER™ friendly DNA engineering and cloning method by uracil excision. Nucleic Acids Res. 35, 1992-2002 (2007).
21. Casini, A. et al. One-pot DNA construction for synthetic biology: the Modular Overlap-Directed Assembly with Linkers (MODAL) strategy. Nucleic Acids Res. 42, e7 (2014).
22. Shao, Z., Zhao, H. & Zhao, H. DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res. 37, 1-10 (2009).
23. Nakamaye, K. L., Gish, G., Eckstein, F. & Vosberg, H. P. Direct sequencing of polymerase chain reaction amplified DNA fragments through the incorporation of deoxynucleoside alpha-thiotriphosphates. Nucleic Acids Res. 16, 9947-9959 (1988).
24. Zou, R., Zhou, K., Stephanopoulos, G. & Too, H. P. Combinatorial engineering of 1-deoxy-D-xylulose 5-phosphate pathway using cross-lapping in vitro assembly (CLIVA) method. PLoS One 8, e79557 (2013).
25. Zhou, K., Edgar, S. & Stephanopoulos, G. Engineering Microbes to Synthesize Plant Isoprenoids. Methods in Enzymology 575, (Elsevier Inc., 2016).
26. Green and Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (2012).
27. WO 2000/018967.
28. Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499-507 (2014).
29. Smolke, C. D. Building outside of the box: iGEM and the BioBricks Foundation. Nat. Biotechnol. 27, 1099-1102 (2009).
30. Engler, C., Kandzia, R. & Marillonnet, S. A one pot, one step, precision cloning method with high throughput capability. PLoS One 3, e3647 (2008).
31. Banks, C. A. et al. Proteins interacting with cloning scars: a source of false positive protein-protein interactions. Sci. Rep. 5, 8530 (2015).
32. Chen, X. & Zhang, J. Why are genes encoded on the lagging strand of the bacterial genome? Genome Biol. Evol. 5, 2436-2439 (2013).
33. Smanski, M. J. et al. Functional optimization of gene clusters by combinatorial design and assembly. Nat. Biotechnol. 32, 1241-1249 (2014).
34. Santos, C. N., Xiao, W. & Stephanopoulos, G. Rational, combinatorial, and genomic approaches for engineering L-tyrosine production in Escherichia coli. Proc. Natl. Acad. Sci. USA 109, 13538-13543 (2012).
35. Anilionyte, O. et al. Short, auto-inducible promoters for well-controlled protein expression in Escherichia coli. Appl. Microbiol. Biotechnol. 102, 7007-7015 (2018).
36. Fowler, Z. L. & Koffas, M. A. Biosynthesis and biotechnological production of flavanones: current state and perspectives. Appl. Microbiol. Biotechnol. 83, 799-808 (2009).
37. Jendresen, C. B. et al. Highly Active and specific tyrosine ammonia-lyases from diverse origins enable enhanced production of aromatic compounds in Bacteria and Saccharomyces cerevisiae. Appl. Environ. Microbiol. 81, 4458-4476 (2015).
38. Zhou, Y. et al. MiYA, an efficient machine-learning workflow in conjunction with the YeastFab assembly strategy for combinatorial optimization of heterologous metabolic pathways in Saccharomyces cerevisiae. Metab. Eng. 47, 294-302 (2018).
39. Coussement, P. et al. One step DNA assembly for combinatorial metabolic engineering. Metab. Eng. 23, 70-77 (2014).
40. Kang, S. Y. et al. Artificial biosynthesis of phenylpropanoic acids in a tyrosine overproducing Escherichia coli strain. Microb. Cell Fact. 11, 153 (2012).
41. Kim, B. et al. Metabolic engineering of Escherichia coli for the enhanced production of L-tyrosine. Biotechnol. Bioeng. 1-11. https://doi.org/10.1002/bit.26797 (2018).
42. Gao, D. et al. Identification of a heterologous cellulase and its N-terminus that can guide recombinant proteins out of Escherichia coli. Microb. Cell Fact. 14, 49 (2015).

1. A method for ligating at least two nucleic acid molecules comprising: (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end; (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and (iii) ligating the first nucleic acid molecule to the second nucleic acid molecule at the complementary overhangs to form a single nucleic acid molecule.
2. A method for ligating three nucleic acid molecules comprising: (i) providing a first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other; (ii) providing a second nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the second nucleic acid molecule is substantially complementary to the first overhang of the first end of the first nucleic acid molecule; and also providing a third nucleic acid molecule capable of forming a stem-loop structure with an overhang of at least one nucleotide; wherein the overhang of the third nucleic acid molecule is substantially complementary to the second overhang of the second end of the first nucleic acid molecule; and wherein the overhang of the second nucleic acid molecule and the overhang of the third nucleic acid molecule have different sequences and/or are not complementary to each other; and (iii) ligating the first overhang at the first end of the first nucleic acid molecule to the overhang of the second nucleic acid molecule and also the second overhang of the second end of the first nucleic acid molecule to the overhang of the third nucleic acid molecule to form a single nucleic acid molecule.
3. The method according to claim 1; wherein the second nucleic acid molecule comprises a defined sequence.
4. The method according to claim 3; wherein the defined sequence of the second nucleic acid comprises a tag sequence, barcode sequence and/or a linking sequence.
5. The method according to claim 2; wherein the second nucleic acid comprises a first defined sequence and the third nucleic acid comprises a second defined sequence.
6. The method according to claim 5; wherein the first defined sequence of the second nucleic acid molecule comprises a first tag sequence, a first barcode sequence and/or a first linking sequence and the second defined sequence of the third nucleic acid molecule comprises a second tag sequence, a second barcode sequence and/or a second linking sequence.
7. The method according to claim 1; wherein step (i) comprises: (i)(a) providing a double-stranded nucleic acid template comprising a first nucleic acid strand and a second nucleic acid strand substantially reverse complementary to the first nucleic acid strand; (i)(b) providing a first primer comprising a first sequence with at least one modified nucleotide upstream of a second sequence substantially complementary to the first strand of the nucleic acid template; and a second primer comprising a sequence substantially complementary to the second strand of the nucleic acid template; (i)(c) amplifying the nucleic acid template using the first and second primers in a polymerase chain reaction to produce an amplicon; and (i)(d) chemically cleaving the amplicon to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end; or (i)(a) providing a first single-stranded nucleic acid molecule; (i)(b) providing a second single-stranded nucleic acid molecule substantially complementary to the first single nucleic acid molecule; and (i)(c) allowing the first and second single-stranded nucleic acid molecule to anneal to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end.
8. The method according to claim 2; wherein step (i) comprises: (i)(a) providing a double-stranded nucleic acid template comprising a first nucleic acid strand and a second nucleic acid strand substantially reverse complementary to the first nucleic acid strand; (i)(b) providing a first primer comprising a first sequence with at least one modified nucleotide upstream of a second sequence substantially complementary to the second strand of the nucleic acid template; and a second primer comprising a third sequence with at least one modified nucleotide upstream of a fourth sequence substantially complementary to the first strand of the nucleic acid template; (i)(c) amplifying the nucleic acid template using the first and second primers to produce an amplicon; and (i)(d) chemically cleaving the amplicon to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other; or (i)(a) providing a first single-stranded nucleic acid molecule; (i)(b) providing a second single-stranded nucleic acid molecule substantially complementary to the first single nucleic acid molecule; and (i)(c) allowing the first and second single-stranded nucleic acid molecule to anneal to produce the first nucleic acid molecule comprising a first overhang of at least one nucleotide in length at a first end and a second overhang of at least one nucleotide in length at its other (or second) end; wherein the first overhang and the second overhang have different sequences and/or are not complementary to each other.
9. The method according to claim 1; wherein the number of nucleotides in the first overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the second nucleic acid molecule are 1, 2 or 3.
10. (canceled)
11. The method according to claim 2; wherein the number of nucleotides in the first overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the second nucleic acid molecule are 1, 2 or 3 and independently, the number of nucleotides in the second overhang of the first nucleic acid molecule and the number of nucleotides in the overhang of the third nucleic acid molecule are 1, 2 or 3.
12. (canceled)
13. The method according to claim 1, further comprising the steps of: (iv) using the single nucleic acid molecule from step (iii) as a template for amplifying in a polymerase chain reaction with two amplification primers to produce an amplicon; and (v) performing a ligation to join the amplicon with at least one other nucleic acid molecule to form an assembly comprising the ligated nucleic acid molecules.
14. The method according to claim 13; wherein step (iv) comprises amplifying the template with an amplification primer comprising at least one modified nucleotide and another amplification primer to produce the amplicon; chemically cleaving the amplicon to produce an end with a third overhang; and step (v) comprises ligating the amplicon to at least one other nucleic acid molecule with a fourth overhang substantially complementary to the third overhang.
15. The method according to claim 13; wherein the at least one other nucleic acid molecule in step (v) is an amplicon from step (iv) using another single nucleic acid molecule from step (iii) as a template.
16. The method according to claim 13, wherein step (v) comprises ligating to form a concatemer of nucleic acid molecules, wherein each nucleic acid comprises substantially the same sequence.
17. The method according to claim 13, wherein the assembly comprising the ligated nucleic acid molecules is circular.
18. (canceled)
19. The method according to claim 1 or 2, further comprising the steps of: (iv) using the single nucleic acid molecule as a template from step (iii) for amplifying in a polymerase chain reaction with two amplification primers to produce an amplicon; (v) performing a ligation to join the amplicon to a plurality of nucleic acid molecules to form an assembly of joined plurality of nucleic acid molecules.
20. The method according to claim 19; wherein step (iv) comprises amplifying the template with an amplification primer comprising at least one modified nucleotide and an amplification primer comprising at least one modified nucleotide; further chemically cleaving the amplicon to produce a first end with a third overhang and a second end with a fourth overhang.
21. The method according to claim 19 or 20; wherein each of the plurality of nucleic acid molecules is an amplicon from step (iv) using another single nucleic acid molecule from step (iii) as a template.
22. The method according to any one of claims 19 to 21, wherein the amplicon and each of the plurality of nucleic acid molecules have different sequences.
23. The method according to any one of claims 19 to 21, wherein step (v) comprises ligating to form a concatemer of nucleic acid molecules, each with substantially the same sequence.
24. The method according to claim 19, wherein the assembly of joined plurality of nucleic acid molecules is circular.
25. (canceled)
26. The method according to claim 5 or 6, further comprising the steps of: (iv) using the single nucleic acid molecule from step (iii) as a template for amplifying in a polymerase chain reaction with an amplification primer having a sequence designed based on at least part or all of the first defined sequence and comprising at least one modified nucleotide and another amplification primer having a sequence designed based on at least part or all of the second defined sequence and comprising at least one modified nucleotide to produce an amplicon, chemically cleaving the amplicon to produce a first end with a third overhang and a second end with a fourth overhang; and (v) performing a ligation to join the amplicon to a plurality of nucleic acid molecules to form an assembly of joined plurality of nucleic acid molecules; wherein each of the plurality of nucleic acid molecules is an amplicon from step (iv).
27. The method according to claim 26, wherein each of the plurality of nucleic acid molecules is an amplicon from step (iv) using another single nucleic acid molecule from step (iii) as a template.
28. The method according to claim 26 or 27, wherein amplification primers designed based on defined sequences of a said second nucleic acid molecules comprising a stem-loop structure as applicable are used to order and/or arrange the plurality of nucleic acid molecules in the assembly.
29. The method according to any one of claims 26 to 28, wherein the amplicon and each of the plurality of nucleic acid molecules have different sequences.
30. The method according to any one of claims 26 to 29, wherein the assembly of joined plurality of nucleic acid molecules is circular.
31. (canceled)
32. The method according to claim 30, wherein the method further comprises using polymerase chain reaction with amplification primers designed based on applicable defined sequences to implement a modification in the assembly of joined plurality of nucleic acid molecules, wherein the modification comprises inserting at least one nucleic acid molecule into the assembly, removing at least one joined nucleic acid molecule from the assembly or replacing at least one joined nucleic acid molecule in the assembly.
33. The method according to claim 32, comprising implementing the modification to form a library of different plasmids.
34. (canceled)
35. A nucleic acid molecule comprising a defined sequence capable of forming a stem-loop structure with an overhang of at least one nucleotide.
36. The nucleic acid molecule according to claim 35; wherein the defined sequence comprises a tag sequence, a barcode sequence and/or a linking sequence.
37. (canceled)
38. (canceled)
39. A kit comprising a plurality of nucleic acid molecules; each with a defined sequence capable of forming a stem-loop structure with an overhang of at least one nucleotide.
40. The kit according to claim 39; wherein each defined sequence independently comprises a tag sequence, a barcode sequence and/or a linking sequence.
41. (canceled)
42. (canceled)
43. The kit according to claim 39, further comprising one or a plurality of oligonucleotide(s).
44. The kit according to claim 43, wherein said one oligonucleotide or each oligonucleotide from the plurality of oligonucleotides is capable of annealing to a defined sequence of at least one of the plurality of nucleic acid molecules.
45. (canceled)